Linux (Mint) equivalent of Windows Search and capabilities?
Posted: Thu Feb 14, 2013 8:48 am
I am a very long time Windows user (currently on Windows 7, and not liking where things are headed with W8, at least as they appeared under Sinofsky) and I'm considering changing to Linux (specifically Mint) as my main desktop workstation. I'm not totally unfamiliar with Linux but I've never used it intensively as a desktop workstation. As you can guess I have a few questions and this one is about search/indexing tools. In particular, I am a very heavy user of Windows Search in Windows 7. It has several features that I also require from a Linux replacement and I'd like to establish what is available in Linux in general and Mint in particular.
First of all a bit of background. Windows Search is a much under-appreciated and very powerful tool that is integrated into Windows Explorer windows (i.e. file manager windows), file chooser dialogs, and the Start Menu (amongst other things). Apologies if any of this sounds like a Microsoft advert but here are some of the Windows Search (WS) features I currently use and need in a Linux replacement:
1) When WS is accessed from file manager windows (by typing into the search box integrated into each Explorer/file manager window) and file chooser dialogs, it searches file names, file content, and properties/metadata of indexed files in the current directory and subdirectories. Ideally I'd have similar UI/shell/file chooser integration on Linux.
2) One can save searches, narrow down searches, and use advanced query syntax to query content and/or metadata (e.g. properties). Exposed properties are themselves extensible by software developers. I'd like any Linux equivalent to have the same ability to automatically search not just file names but also content and metadata (with an optional advanced syntax to precisely control what is returned).
3) When WS is accessed from the Start Menu (by typing into the 'Search programs and files' box) it searches installed programs and Start Menu commands, the file names, content, and properties/metadata of indexed files, and the contents and properties/metadata of data stores on the machine (e.g. the Outlook data store). This route into search can display summarised hits within the Start Menu itself, or one can choose to see more complete hits in a customised file manager (Explorer) window where one can also narrow down the search terms, save a search, and so on. The equivalent on Linux doesn't have to be on the Start Menu or its equivalent, but I'd need something similarly accessible that can show complex, sortable, narrowable results.
4) If you instruct WS to search a location that is not indexed then it will use its available filters to search files in that location in real time. I'd like the same from the Linux equivalent.
5) WS continuously updates its index of items in indexed locations as items are added, edited, moved, deleted, etc. I'd like the same on Linux of course.
6) WS is extensible: Anyone (well, anyone who is a programmer) can write a new text extraction or properties extraction filter for new file formats or for data stores. To be clear, a data store in this context is something like a mail program, where indexing individual files might not make sense or might lose context so that it makes more sense to treat the program's data as a database that can expose its content and metadata to the indexing/search system. Naturally I'd like to see the same sort of capability on Linux.
7) The tool needs to be well supported and be under active, ongoing development.
Ok, those are the main things I like about WS and are what I need from any search/indexing tool on Linux. It needs to be integrated into the desktop and file management environment, it needs to have advanced querying capability, it needs to be extensible, and so on. Obviously all of these things are technically possible on Linux! The real question is how far have they come in practice?
I looked at this a couple of years ago and I recall that Beagle looked like it was going in the right direction. However, I checked again more recently and I found that Beagle was no longer being developed (not least, it seemed, because it was written in Mono). So what is the state of the art of Linux desktop content/metadata/properties indexing?
P.S. Just in case you think I'm a complete Windows Search fanboy, it also has several drawbacks. One, in my opinion, is the overly dumbed-down UI in Windows 7, another is the poor user documentation, and another is Microsoft's lack of push to get software vendors to more actively support WS/IFilters/Property Handlers/Protocol Handlers for their file and data store formats.