
firefox-scrapbook's Introduction

ScrapBook X is a legacy Firefox add-on that captures web pages to a local device for later retrieval, organization, annotation, and editing. It is based on ScrapBook (by Gomita) and ScrapBook Plus (by haselnuss).

Features

  1. Save web pages faithfully: Web pages shown in the browser can be saved without losing any subtle detail. Metadata such as source URL and saving time are recorded for later reference.
  2. Save partial content: You can save only part of a web page. You can decide whether to save images, audio and video files, fonts, frames, styles, and/or scripts, and how saved styles are processed. You can edit the web content before saving, or save a web page as a bookmark. More saving options are available.
  3. Extensive save: You can save a web page together with the pages and files it links to, save multiple open tabs, or save a list of pages from a URL list. More batch-saving functions are available.
  4. Manage data: You can manage saved items in a tree structure, as easily as managing bookmarks.
  5. Search data: You can search any fragment of the saved web pages with the built-in full-text engine.
  6. Edit data: You can add highlights, comments, and annotations, or even edit the source HTML of the saved pages.
  7. Take notes: You can create note pages in ScrapBook and edit them as easily as editing web pages.
  8. Input and output data: You can combine multiple data items into one. You can generate an HTML tree list and build a static scrapbook site. You can configure multiple ScrapBook databases that do not interfere with each other. You can import and export data items for backup or exchange.
  9. Addons: Some Firefox add-ons can be integrated with ScrapBook to extend its power, such as these ones.

Installation

Download the .xpi file of the desired version from the releases list with a Firefox-like browser, and you are done.

  • Be sure to disable or remove ScrapBook, ScrapBook Plus, or other similar add-ons to prevent a potential conflict.

  • ScrapBook X, as a legacy Firefox add-on, is not supported by Firefox Quantum (>= 57). It can still be installed in an older Firefox or a Firefox (Gecko) fork which still supports XUL/XPCOM, such as WaterFox, Basilisk, or Pale Moon.

  • Since Firefox 43, add-ons must be signed by Mozilla to be installable, but ScrapBook X versions after 1.14.5 are no longer signed because Mozilla has stopped signing legacy add-ons. To get the latest ScrapBook X to work, use a Developer Edition, Nightly, ESR, or unbranded version of Firefox with the xpinstall.signatures.required preference in about:config set to false (read the documentation for details), or use an older Firefox version or a Firefox fork, as described in the previous point.

  • ScrapBook X is not compatible with Electrolysis (e10s). Be sure to disable e10s (check the Multiprocess-related fields in the about:support page) when using ScrapBook X or it may not function as expected.

Usage

For usage guide, further information, frequently asked questions, or other details, visit the documentation wiki.

Announcement

We are not going to implement support of e10s and WebExtension for ScrapBook X. For a WebExtension "port" of ScrapBook X, check WebScrapBook, which is our successor project of ScrapBook X and works on many modern browsers.

We will continue basic maintenance of ScrapBook X. However, new features or anything that requires a large code rework are unlikely.

firefox-scrapbook's People

Contributors

danny0838, gomita, maxnikulin, misteralanbradley, yfdyh000

firefox-scrapbook's Issues

Auto-find Folder for Bookmarking

Scrapbook Plus used to find the current folder when bookmarking a new page - the highlighted folder the cursor was on before bookmarking was indicated in the "Browse to Folder" dialog, making it very easy to save the bookmark to the correct folder. That behaviour seems to no longer be supported. Could you bring it back, PLEASE!

Firefox Android Support

Is this extension supported by Firefox for Android? If so, that would be great, because it would make it possible to sync the same ScrapBook collection between my PC and my Android phone.

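Could a feature to search folder names be added?

If I create one folder per topic and put the collected web content into it, I quickly end up with many folders, so I need to search them. Could searching by folder name be added to title search and full-text search?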

Enhancement request: Verbatim file/folder structure on disk

I'm using Scrapbook with sync solutions such as Dropbox and BTSync. They work on file level. If I modify File.A on one PC and File.B on another, they'll be synchronized. If I modify the same file in two places at once then the oldest version wins and I have to go resolve the conflict manually.

This is acceptable since it's rare and when I do modify one file from two places red light is already flashing in my head and I'm quite attentive to what I'm doing.

In Scrapbook, files are also tracked by scrapbook.rdf. So when I make changes to a folder structure from two places at once, only one change is kept, even if they were unrelated.

This, on the other hand, happens often. Running Firefox on two PCs at the same time is nothing unusual, and you can easily forget that you moved some folder on one PC while that Firefox instance is still running (changes not committed/propagated/read back on the other Firefox instance).

I'm not sure at which point Scrapbook saves or reloads scrapbook.rdf. Surprisingly, I haven't encountered the problem all that often. Perhaps I'm careful enough. But the thought bugs me.

So I thought, why does Scrapbook even use one single file to track everything, when it basically just keeps a file/folder structure? That's what files/folders are for!

Why not just keep a verbatim folder/file structure on disk, just like it looks in Scrapbook?

There are many advantages to using verbatim structure:

  1. Simplicity. Removes part of the equation, expressing the functionality in simpler terms.
  2. Stability. Scrapbook.rdf is a point of failure, as evidenced by the need to keep backups and the "Restore scrapbook.rdf" function. Moreover, it's an artificial point of failure, as its most common failure mode (afaik) is "data inconsistent with what is on disk". Just leave only what is on disk!
  3. Compatibility. All the tools in the world work with files as units of information and folders as a unit of grouping. Whatever cool enhancements build on that abstraction (such as sync solutions and perhaps others), Scrapbook would automatically benefit.
  4. Isolation. Changes are local, only affect that which was changed.
  5. Portability. One of the shortcomings of Scrapbook as a note library is that you cannot take it with you on Android/iPhone. But if the structure was stored verbatim, you can just sync the folder to Android and browse/edit with any of the number of tools.
  6. Openness. You don't need data in a special format to be used with Scrapbook. Ideally, you can just drop the .txt file into the Scrapbook folder, and it'll display it as a note (although without any Scrapbook-private features).

Why could scrapbook.rdf be needed in the first place?

  • To distinguish between note/page types? Lots of apps do this using just files. Use a file extension or if you insist on all files being HTML, add some meta tags into the header.
  • To store additional data? Store it as meta (or other) tags in HTML. HTML is designed to be extensible, why not use it.
  • Data which is too large/unwieldy for meta tags can also be stored in an external annotation file, the important point is that it's kept close to the main file, and that it's optional - if it's lost, the main file is still visible, but perhaps without some customizations.
  • To allow grouping of several files into one note? This can be done by straying just a bit from the verbatim structure. Let folders which have index.html inside be treated as files, where Scrapbook is concerned. Or do it the IE way where folders with the same name as HTML are treated as resource folders. (I'd go the latter way as it's a bit more compliant to established practice, but this point is not important)
  • Loading speed? But you don't need to load the whole tree immediately as Scrapbook starts. Just scan the root folder, and then scan folders in the background if you care, and force-scan a subfolder when the user expands it.
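
The lazy-loading idea in the last point can be sketched roughly as follows. This is only an illustration of the proposal, not ScrapBook code: the function name scanLevel and the item shape are hypothetical, and it assumes the "folder containing index.html counts as one page" convention suggested above. It uses Node.js file APIs for brevity.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: list ONE folder level without descending further.
// A folder that contains index.html is reported as a single "page" item;
// any other folder is a "folder" whose children are read only on demand.
function scanLevel(dir) {
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .map((entry) => {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        const isPage = fs.existsSync(path.join(full, "index.html"));
        return { name: entry.name, path: full, type: isPage ? "page" : "folder" };
      }
      return { name: entry.name, path: full, type: "file" };
    })
    .sort((a, b) => a.name.localeCompare(b.name));
}
```

A tree view would call scanLevel on the root at startup and call it again on a subfolder only when the user expands that folder.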

Since no one did this before, I assume there are some complexities here I'm not thinking about. But please at least consider the idea and if those can after all be solved! To me it seems that verbatim data structure on disk would be of benefit in many ways to advanced users, while changing almost nothing to Scrapbook users who just work through the interface.

Thanks!

Cannot compose email of mail.yahoo.co.jp if Scrapbook x installed

This problem happens on Windows7 Firefox31ESR and Firefox36+.

Steps To Reproduce:

  1. Install Scrapbook X with newly created profile
  2. Login mail.yahoo.co.jp
  3. Compose mail and attempt to edit message body

Actual Results:
No caret is shown. And I cannot input/edit text.

Disabling Scrapbook X solves the problem.
The original Scrapbook does not have the problem.

feature request: preview note as markdown

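First of all, thank you very much for improving this extension. I have used Scrapbook and Scrapbook Plus for many years. At first Scrapbook X did not seem to offer many compelling improvements, but I eventually found that the small "enter HTML editing mode" change brings a lot of convenience, especially when combined with the newly added note pages. Because of it, I completely gave up my old habit of using Scrapbook to clip web pages and Evernote to take notes (in an office setting; the company proxy does not let us sync notes to a server anyway) and moved everything to Scrapbook. For HTML editing, however, Scrapbook X only provides keyboard shortcuts, which is not very convenient; fortunately there is another extension, Page Hacker Revived.

The old note in Scrapbook was an odd design: the UI provides a plain text box that accepts only plain text, and if the note is modified with an external HTML editor (adding headings, tables, colors, and so on), Scrapbook can no longer display its content. Yet it is ultimately saved as an HTML file, and it also has a strange "preview" function (strange because there seems to be no scenario where it is useful).

So I wonder whether this old kind of note could be handled as Markdown instead: the saved content could remain plain text, but the preview would be rendered as Markdown.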

cannot perform simple AND-ed search

Firstly, your add-on is awesome. Thanks for all the effort.

Now, suppose I want to find all the pages, notes, etc. in which any data (title, content, comments, etc.) contains the exact words: Processor Interrupts Sec

Currently there is no way to do this easily (i.e. without regular expressions).
The following query doesn't work: Processor Interrupts "Sec"
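
For illustration, the requested behaviour could look roughly like the sketch below. It is not ScrapBook's search code: the function names and the item fields (title, content, comment) are hypothetical, and it assumes that "exact words" means whole-word, case-insensitive matching, which only works reliably for ASCII word characters.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Split a query into terms: bare words and "quoted phrases".
function parseQuery(query) {
  const terms = [];
  const pattern = /"([^"]+)"|(\S+)/g;
  let match;
  while ((match = pattern.exec(query)) !== null) {
    terms.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return terms;
}

// An item matches only if EVERY term occurs as a whole word
// somewhere in its title, content, or comment (logical AND).
function matchesAll(item, query) {
  const text = [item.title, item.content, item.comment].filter(Boolean).join("\n");
  return parseQuery(query).every((term) =>
    new RegExp("\\b" + escapeRegExp(term) + "\\b", "i").test(text)
  );
}
```

With this sketch, the query Processor Interrupts "Sec" matches an item that contains all three words anywhere in its fields, but not an item that only contains "second".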

Automate "Output HTML Tree"

As mentioned in #9 (comment), automatically outputting the HTML tree whenever the database (RDF) changes would be more user-friendly, since we would no longer forget to do it after an important change.

Nevertheless, there are currently some difficulties:

  1. Output is currently an asynchronous task run in a separate XUL window, and it is not easy to turn it into a completely isolated background process.
  2. Performance could suffer if the process runs on every simple database change, since it has to go through all database items.
  3. How should the related parameters be defined if the output is automated? For example, how do we define which folders should be exported (folders could change when the database changes)?

We'll need further investigation to implement this.
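
One possible way to soften the performance concern in point 2 is to coalesce changes, so that many database changes lead to at most one export. The sketch below only illustrates that idea; the class TreeExportScheduler and its methods are hypothetical, and it does not address points 1 and 3.

```javascript
// Hypothetical scheduler: many database changes trigger at most one export.
class TreeExportScheduler {
  constructor(exportTree) {
    this.exportTree = exportTree; // callback that performs the real export
    this.dirty = false;
  }

  // Call on every database change; cheap, does no export work itself.
  markChanged() {
    this.dirty = true;
  }

  // Call from an idle timer or on shutdown; exports only if something changed.
  // Returns true if an export was performed.
  flush() {
    if (!this.dirty) return false;
    this.dirty = false;
    this.exportTree();
    return true;
  }
}
```

The caller decides when flush() runs, for example after a period without changes, so the full pass over all database items happens once per batch of edits rather than once per edit.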

How to change the background of the scrapbook tree

I personally don't like the white background of the ScrapBook tree; I find it too light. To change the background you can edit the file userChrome.css. More information on this file can be found here.
To edit the file, it is useful to install the Firefox add-on ChromEdit Plus.
To change the background of the ScrapBook tree, add this code to userChrome.css:

    #sbTree, #bookmarks-view, #historyTree {
      background-color: #CCFFCC !important;
    }

My background color is light green (#CCFFCC). Save your file and restart Firefox.

A note page sometimes becomes non-utf8 after saving

When we create a note page, a new tab is automatically created. However when we edit the note page in the tab and try to save it, we may sometimes receive a "the file is not utf8 encoding" error.

This seems to be related to notex_template.html: if the body element contains nothing, an empty HTML document may be produced on the first launch, with the encoding definition in the meta tag missing; if the body element contains something, the error never seems to happen.

I need your help to investigate this problem further.
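
As a possible safeguard while the cause is unknown, the saved source could be checked for a charset declaration before it is written. The following is a rough sketch, not a confirmed fix: the function name ensureUtf8Meta is hypothetical, and it uses regular expressions on the HTML string rather than a real parser.

```javascript
// Insert <meta charset="UTF-8"> if the HTML source declares no charset.
function ensureUtf8Meta(html) {
  // Covers both <meta charset="..."> and the http-equiv Content-Type form.
  if (/<meta[^>]+charset\s*=/i.test(html)) return html;
  const meta = '<meta charset="UTF-8">';
  const headTag = /<head(\s[^>]*)?>/i; // matches <head>, but not <header>
  if (headTag.test(html)) {
    return html.replace(headTag, (head) => head + meta);
  }
  // No head element at all (for example an empty document): prepend the tag.
  return meta + html;
}
```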

Problems with DOM Inspector

Just stumbled upon something weird; I wonder if anyone can reproduce it:

  1. Save a page with Scrapbook X.
  2. Use the DOM Inspector to remove some elements of the page.
  3. Save the changes.
    Suddenly, the whole editing bar (highlight, delete text, DOM Inspector) becomes grey, i.e. not selectable, and you cannot make any other changes. If you select another Scrapbook entry in the sidebar, the browser asks "Leave the page" or "Stay on Page"; if I choose "Leave the page" and then return, I can make modifications with the edit bar again.

My Specs: Win XP SP3, PaleMoon 24 portable, latest version of Scrapbook X (ends with 0.35)

Any ideas?

Notification type

Notifications from ScrapBook X differ from "usual" Firefox notifications: they are shown in the system tray.
Could there be a preference for how the notification is shown?

I really like the feature to be notified if a page was captured before, but currently it pollutes my notification system tray.

In 1.12.0a25 images are always captured

The checkboxes "Capture images" and "Capture media" have no effect: even when they are turned off, all images from the page are captured. Furthermore, even embedded TTF fonts are saved.

Better manage sticky notes and combined pages

Currently the stylesheets required for page sticky notes or for combined pages are stored in the chrome (i.e. the Firefox settings folder). The problem is that these stylesheets can only be retrieved by Firefox with the ScrapBook add-on installed, and they will be missing if such pages are viewed in other browsers or in Firefox without the ScrapBook add-on.

There are several strategies for storing the stylesheet information, and each has its limitations:

1. via chrome://

Pros: (1) Saves the most space. (2) Updates with the ScrapBook version.

Cons: (1) Cannot be viewed correctly on other browsers.

2. via a stylesheet file in the ScrapBook folder

Pros: (1) One stylesheet file per ScrapBook folder. Saves space. (2) Allows customization for each ScrapBook folder.

Cons: (1) Cannot be viewed correctly if exported. (2) Requires manual removal to update to a newer version when updating ScrapBook.

3. via a stylesheet file in the data folder

Pros: (1) Well preserves the data pages. (2) Allows customization for each data folder.

Cons: (1) One stylesheet file per data folder. Space consuming.

4. via a style element in the data page

Pros: (1) Preserves the data pages most exactly.

Cons: (1) One copy of the stylesheet information per data page. Wastes the most space. (2) Cannot update to a newer version when updating ScrapBook.

I need more feedback on this problem. What do you think about this? Do you have a better idea?

Captured page is not normally accessible

Just encountered a problem with the following page
http://www.investopedia.com/university/financialstatements/financialstatements2.asp

Whatever the capture method (whole-page capture, or only a multi-page selection), Scrapbook X reports the completion of the capture, but on the captured page there is no vertical scroll bar, and you cannot use the mouse wheel to scroll. You can scroll only with the up and down arrow keys on the keyboard.
The normal, "live" page doesn't behave like this.
Just tested it with other sub-pages of investopedia - same problem.
Any ideas?

Tested with 0a41 and 0a42, PaleMoon Portable, Windows XP SP3.

EDIT: I think the culprit in this case is the "Reorganise Styles" option in the capture dialog box: if I deselect it, the page is captured "normally", i.e. you can scroll normally on the captured page.

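Some parts that are not user-friendly enough

The options of the "Capture page details" feature are not remembered automatically; they have to be set again every time it is opened.
I would also like an option to filter out iframes.


There is no function for moving folders. Currently folders can only be moved by manual drag and drop, which is very inconvenient when there are many folders.


When a folder is selected and a new folder is created, the new folder is created at the same level, but normally it should be created as a child.


Sorting: I would like to be able to sort only folders or only files.
I would also like an option to sort folders before files.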
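
The last request (folders before files) amounts to a comparator like the one sketched below. The item shape ({ title, type }) and the option name foldersFirst are assumptions made for this illustration, not ScrapBook's data model.

```javascript
// Sort items by title, optionally placing folders before all other items.
// Returns a new array; the input is left untouched.
function sortItems(items, { foldersFirst = true } = {}) {
  return [...items].sort((a, b) => {
    if (foldersFirst && a.type !== b.type) {
      if (a.type === "folder") return -1;
      if (b.type === "folder") return 1;
    }
    return a.title.localeCompare(b.title);
  });
}
```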

Note Page title not set

Legacy Scrapbook Note takes its title as the first line of the note.
Note Page doesn't do this automatically.
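
The legacy behaviour could be approximated for note pages along these lines. This is a sketch on plain text only: the function name titleFromNote and the 50-character limit are hypothetical, and skipping leading blank lines is an assumption rather than documented legacy behaviour.

```javascript
// Derive a title from the first non-empty line of a note's text.
function titleFromNote(text, maxLength = 50) {
  const firstLine = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .find((line) => line !== "");
  if (!firstLine) return "";
  return firstLine.slice(0, maxLength);
}
```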

Restart l10n

Either keep it on BabelZilla or move it to a platform such as Transifex, so that translations can be updated promptly.

Fix Arbitrary Source Rewrite

Including the rewrite of styles and meta tags on capturing a page and saving a page.

  • sbContentSaver.inspectCSSText in saver.js
  • sbPageEditor.savePage in editor.js

Multiple selection support in the main sidebar

As mentioned here, multiple selection in the main sidebar has been requested.

Currently we support multiple selection and manipulation in the "Manage" window. However, due to a conflict between "Open in a new tab" and "Multiple selection" for Shift/Ctrl + click, multiple selection support in the main sidebar is pending further investigation.

What should we do?

  1. No multiple selection. Keep the current "click = open", "shift/ctrl + click = open in new tab". (ScrapBook Plus 1.9.24.40b3 behavior)
  2. Add multiple selection. Keep the current "click = open", "shift/ctrl + click = open in new tab + multiple selection". Neglect the unintuitive behavior that shift/ctrl + click opens the clicked item. (ScrapBook 1.5.11 behavior)
  3. Add multiple selection, "click = open", "shift/ctrl + click = multiple selection". Cancel the feature of "shift/ctrl + click" for open in the new tab (alternatives: right-click > [Open in new tab], and middle-click if the mouse support is available).

We need your feedback for this. What do you think about this? Do you have another good idea?

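[req] Recent folders, tags

For example:
Show recently added items and folders (that is, something like the recent-folders list that appears when capturing, additionally placed in the tools).

I would also like a tag feature.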

'Edit Before Capture' in context menu

Hi,

Any plans of adding the 'Edit Before Capture' in the context menu? It's available in the toolbar button's menu, but adding it to the context menu could be helpful IMO.

I use another add-on called OmniSidebar and I've added ScrapBookX's toolbar button to OmniSidebar's sidebar in order to have ScrapBookX's saved files immediately accessible. But since ScrapBookX's 'Edit Before Capture' is only accessible from its toolbar button, it obliges me to open the sidebar... :) (life is tough!)

Thanks,

EDIT : I received an email with a response viewable here but here it does not appear ...

Scrapbook X and Scrapbook

Hello Danny,

Thank you for your work.
Today, Firefox automatically updated Scrapbook X to Scrapbook version 1.5.12 from @gomita.
Is there any compatibility problem with firefox-scrapbook X?
Just to know, is there any coordination between you two, or do these two versions evolve in parallel?

Thanks,

AutoSave?

Any chance of getting auto save working with this? I've tried both of the versions that I can find and neither seems to work with this version of ScrapBook. For me it's the only must have feature that's missing.

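enhancement request - let a user-defined function specify the capture region in multi-level capture

Existing behaviour: for the current page, the user can capture a selected region, but multi-level capture has to grab each whole page. (Ads, replies, and comments are quite likely things I do not want.)

Multi-level capture currently has three settings: depth, timeout, and encoding. I suggest adding one more: capture region (a textbox), which accepts a user-defined function.

The function takes one parameter, url, which is the address of the page being captured, and returns an object such as

    {
      selector: '#foobar',
      img: true,
      style: true,
      // more settings
    }

where the selector property specifies the region to capture and the img property specifies whether to capture the images of that page.

Properties not set on the object fall back to the global settings. If the function returns true, the global settings are used. If it returns false or undefined, the page is skipped and not captured.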
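
The return-value rules above could be resolved as in the following sketch. Everything here is hypothetical and only illustrates the proposal: globalDefaults, userRule, and resolveCaptureOptions are invented names, and the defaults shown are placeholders.

```javascript
// Placeholder global settings (assumed for this illustration).
const globalDefaults = { selector: null, img: true, style: true };

// Example of a user-defined function as described in the request.
function userRule(url) {
  if (url.includes("/ads/")) return false; // skip this page
  if (url.endsWith(".html")) return { selector: "#foobar", img: false };
  return true; // use the global settings as they are
}

// Turn the rule's return value into final options, or null to skip the page.
function resolveCaptureOptions(rule, url, defaults) {
  const result = rule(url);
  if (result === true) return { ...defaults };
  if (!result || typeof result !== "object") return null; // false or undefined
  return { ...defaults, ...result }; // unset properties fall back to defaults
}
```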

Access denied when removing data folders sometimes

Sometimes when I delete data items, I get an "access denied" error on removing the data folder(s), even though the data items are correctly removed from the data list and the files inside the data folder are all removed. If I then try removing the folder via the "Calculate" dialog, the result is the same. Finally, if I open the OS file explorer and delete the folders from there, it works smoothly without any error.

So far this has only happened when the ScrapBook folder is inside a Dropbox or Google Drive sync folder. I suspect that access from Firefox is blocked by the sync software for some reason. If so, the problem lies not in ScrapBook but in the sync software.

I need your help to confirm this problem: Does an "access denied" error on deleting a data folder ever happen? And does it happen without using a Dropbox, Google Drive, OneDrive or other sync tools?

Scrapbook X captures less than Scrapbook on a certain page

I am using Scrapbook and Scrapbook X to capture a certain page.

I am trying to download a website with an overall design like:
http://www.MainPage.com/Subsite1
http://www.MainPage.com/Subsite2
etc.

Each subsite has pdfs, but they are linked in different ways.
ScrapBook easily handles them, when the link is “Subsite/filename.pdf”
But most of them are in a more complex format, like “Subsite/.pdf?title=&asset_type=FILETYPE/pdf…(a lot of file details follow)”, which I can’t make ScrapBook open.

Scrapbook X captures all the pdfs, but it can't capture the links from MainPage to Subsite.
When I ask it to save the MainPage, it will start the capture dialogue, but freeze in the "Loading...20" stage.
Scrapbook captures all links, but not the pdfs.

Do you have any advice?

enhancement request - "search engine" and page results

Having scrapbook among the "search engines" available in the search dropdown, and then having the results display in a page search-engine style would be awesome. It would make Scrapbook into a mini-internet!

It would be cool too if the search results page had its own search field, like a web search engine, and this page could even be bookmarked (although that is not really necessary if a search is provided in the search dropdown).

Search results are missing .html

Search result preview doesn't show up, as all links are missing the extension .html.
The same happens when double clicking a search result (url in new tab is missing a .html).

[enhancement] voluntarily choose to ignore few file extensions

WWW is bloated.

Try clipping a page like this one with Scrapbook X:
https://addons.mozilla.org/en-US/firefox/search/?q=ScrapBook&appver=&platform=

Clipped webpage size: 1771.52 KB

go to clipped webpage local folder

delete file extensions:
*.eot
*.ttf
*.woff
*.svg

All font files are now deleted

Clipped webpage new size: 217 KB
Webpage functions properly.

88% reduction in size

It would be nice if Scrapbook X had a setting to automatically ignore certain file extensions.
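
Such a setting could boil down to a check like the one sketched here. It is an illustration only: the names ignoredExtensions and shouldSkipResource are hypothetical, and the extension list is simply the one from the experiment above.

```javascript
// Extensions the user chose to ignore (taken from the list above).
const ignoredExtensions = new Set(["eot", "ttf", "woff", "svg"]);

// Decide whether a resource should be skipped, based on the extension
// of its URL path. Query strings and fragments are not considered.
function shouldSkipResource(resourceUrl, ignored = ignoredExtensions) {
  const pathname = new URL(resourceUrl).pathname;
  const dot = pathname.lastIndexOf(".");
  if (dot === -1) return false;
  return ignored.has(pathname.slice(dot + 1).toLowerCase());
}
```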

Updating search index is very slow and freezes the browser

I had to force-close the browser after it had been unresponsive for more than 10 minutes.
ScrapBook 1.5.11 took about 20 s to update the index.

Additionally, it seems that ScrapBook X does not update the index automatically when new pages have been added and a new search is then started.

Cannot delete entry in Calculation Window

Don't know if this is intentional or not, but the delete button on right click on an entry in the Calculate Size window is greyed out, i.e. you cannot delete from there.
Tested with 0a39.
