Web Site Enhancement: Reduce sensitivity of "common words" filter
Forum rules
Do not post support questions here. Before you post read: Where to post ideas & feature requests
- jharris1993
- Level 3
- Posts: 188
- Joined: Mon Nov 05, 2012 9:43 pm
- Location: Worcester Ma. (USA) when I'm not in Moscow Russia
- Contact:
Web Site Enhancement: Reduce sensitivity of "common words" filter
Issue:
When conducting a search on these fora, the search engine often selects and throws away words that are essential elements of the query because they are "common words".
A specific example in point:
Today I am experiencing a problem with certain windows not respecting the screen boundaries, leaving essential elements unreachable off the bottom of the screen. So I searched for "window too big" - a generalized query because I'm not sure how others may word titles or text. The search engine decided to discard "too" and "big" because they were "common words", and I ended up with something like 4,000-and-some-odd results for the search term "window".
I understand that I can place my query within quotes which will preserve all my search terms. I also understand that simplifying the search query by removing inessential words is an important strategy for improving search speed and the relevance of returned items.
However, other users may not be so clever, and this may be part of the problem of people posting the same question for the 30,000th time - they don't know how to search effectively in a situation like this. So, they get either no results or too many results to effectively manage.
Likewise, I admit that this isn't the only forum site that suffers from this problem. It's only because I use these fora frequently that it irritates me here more than in other places.
May I suggest that the list of "common words" be significantly reduced? This way, people performing searches will have a better chance of getting results that are relevant to their query.
Respectfully submitted,
Jim "JR"
Some see things as they are, and ask "Why?"
I dream things that never were, and ask "Why Not".
Robert F. Kennedy
“Impossible” is only found in the dictionary of a fool.
Old Chinese Proverb
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
The same goes for filtering any three-letter word; in ICT, almost every specifically technical term is a three-letter acronym. The forum search is useless and many have noticed, although no change has ever been implemented. The usual recommendation is to search through Google, adding "site:forums.linuxmint.com" to the query.
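For anyone who wants to script that workaround, here is a minimal sketch in Python. It only builds the search URL for a site-restricted Google query; the function name `site_search_url` is made up for illustration, and the only assumption about Google is the ordinary `q=` search parameter.

```python
from urllib.parse import urlencode


def site_search_url(terms, site="forums.linuxmint.com"):
    """Build a Google search URL restricted to one site (hypothetical helper)."""
    # The "site:" operator limits results to the given domain.
    query = f"site:{site} {terms}"
    # urlencode percent-encodes the colon and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("window too big"))
# https://www.google.com/search?q=site%3Aforums.linuxmint.com+window+too+big
```

Because the filtering happens on Google's side, short words such as "too" and "big" stay in the query.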
- jharris1993
- Level 3
- Posts: 188
- Joined: Mon Nov 05, 2012 9:43 pm
- Location: Worcester Ma. (USA) when I'm not in Moscow Russia
- Contact:
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
I'm not sure I'd be that critical, especially since most fora don't, and can't, afford the horsepower or manpower to duplicate sites like Google or Dogpile.
I am equally sure that the 90/10 rule applies here: A few (relatively) simple tweaks would probably improve things tremendously. And as interesting as it would be, I have neither the skill, nor the guts, to mess around with the search engine on Mint's live fora. (Though it might be both interesting and instructive to mess around on a test-copy of the system. . . .)
Maybe this is just ignorance speaking here, but I doubt that anyone would have to totally overhaul the search capabilities. Maybe turning off the "common words" filter would be a good way to start and gather data quickly?
What say ye?
Jim "JR"
Some see things as they are, and ask "Why?"
I dream things that never were, and ask "Why Not".
Robert F. Kennedy
“Impossible” is only found in the dictionary of a fool.
Old Chinese Proverb
- AZgl1800
- Level 20
- Posts: 11184
- Joined: Thu Dec 31, 2015 3:20 am
- Location: Oklahoma where the wind comes Sweeping down the Plains
- Contact:
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
Nearly all website search engines require 4 characters or they won't search for a term.
Google Advanced Search does much better.
See this Advanced Search for what you wanted:
https://www.google.com/search?as_q=wind ... as_rights=
Window too big
viewtopic.php?t=171589
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
jharris1993 wrote: Mon Dec 17, 2018 2:08 am
I'm not sure I'd be that critical, especially since most fora don't, and can't, afford the horsepower or manpower to duplicate sites like Google or Dogpile.
Not saying there can't be good reasons for the forum search being useless, just that it is. If you just type e.g. "site:forums.linuxmint.com foo bar" into Google, the problem's solved...
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
That's not true.
Please edit your original post title to include [SOLVED] if/when it is solved!
Your data and OS are backed up....right?
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
This board uses phpBB forum software and is limited to one of the four search backends that phpBB works with (native phpBB search, or one of three databases' built-in search). Every backend has its pros and cons. The search backend we're using was selected for our board size and load. Its limits are configured keeping in mind that our server has to be able to handle the load. One of its configured limits is that it only indexes words of at least 4 characters, and it ignores English-language stop words. So yes, it's not perfect. We can configure it to index shorter words, but not without cost. This just highlights one of the issues with it, though. People expect it to be as good as their internet search engine, and it just won't be anywhere close to that, regardless of the minimum word length for the search index.
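To make the effect of those two limits concrete, here is a rough sketch in Python. It is not how phpBB or the database backend is implemented; it only mimics the two rules described above. The stop-word list is a small made-up sample (the board's real list is not shown in this thread), and `searchable_terms` is a hypothetical name.

```python
# Made-up sample; the real stop-word list used by the board is not known here.
STOP_WORDS = {"the", "and", "too", "very", "with"}
# From the post above: only words of at least 4 characters are indexed.
MIN_WORD_LEN = 4


def searchable_terms(query, min_len=MIN_WORD_LEN, stop_words=STOP_WORDS):
    """Return the query words that survive the length and stop-word filters."""
    kept = []
    for word in query.lower().split():
        if len(word) < min_len:
            continue  # too short to be in the index
        if word in stop_words:
            continue  # ignored as a stop word
        kept.append(word)
    return kept


print(searchable_terms("window too big"))             # ['window']
print(searchable_terms("window too big", min_len=3))  # ['window', 'big']
```

With a 4-character minimum, "window too big" collapses to "window", which matches the result described in the first post. Lowering the minimum to 3 keeps "big" in this sketch, but every extra short word also has to be stored in the index, which is the cost mentioned above.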
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
Not sure how easy it would be to implement in phpBB, but embedding Google Custom Search somewhere on the site might be an option? Not sure if it's the best idea, but maybe worthy of consideration.
https://cse.google.com/cse/
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
Google CSE is possible with https://www.phpbb.com/customise/db/exte ... oglesearch and https://phpbb.hifikabin.me.uk/app.php/page/CGC. It would be a total replacement, though, and would leave people who use the forums with JavaScript disabled, or who block Google content, without an option to search. It is also a bit of a slap in the face for people trying to get out from under Google's search bubble.
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
Hi Xenopeek,
I agree with the bit about not wanting to rely on Google, but from what I can see on the phpBB extensions demo page the two search options can co-exist : https://phpbb.hifikabin.me.uk/.
The free version of GCS does throw up ads, but I guess LM could join Adsense and maybe make a little bit of revenue from it. Wouldn't create any millionaires though!
Anyway, not necessarily promoting the idea, just for clarification.
Re: Web Site Enhancement: Reduce sensitivity of "common words" filter
Ah, that's with it put in the navigation bar. I'm not too sure having two different search fields will make a lot of sense.