Web Site Enhancement: Reduce sensitivity of "common words" filter

jharris1993
Level 3
Posts: 123
Joined: Mon Nov 05, 2012 9:43 pm
Location: Woprcester Ma. (USA) when I'm not in Moscow Russia

Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by jharris1993 » Sun Dec 16, 2018 11:31 pm

Issue:
When conducting a search on these fora, the search engine often selects and throws away words that are essential elements of the query because they are "common words".

A specific case in point:
Today I am experiencing a problem with certain windows not respecting the screen boundaries and leaving essential elements of the screen unreachable off the bottom of the screen. So I search for "window too big" - a generalized query because I'm not sure how others may word titles or text. The search engine decided to discard "too" and "big" because they were "common words" and I ended up with something like 4,000-and-some-odd results for the search term "window".

I understand that I can place my query within quotes which will preserve all my search terms. I also understand that simplifying the search query by removing inessential words is an important strategy for improving search speed and the relevance of returned items.

However, other users may not be so clever, and this may be part of the problem of people posting the same question for the 30,000th time - they don't know how to search effectively in a situation like this. So, they get either no results or too many results to effectively manage.

Likewise, I admit that this isn't the only forum site that suffers from this problem. It's only because I use these fora frequently that it irritates me here more than in other places.

May I suggest that the list of "common words" be significantly reduced? That way, people performing searches would have a better chance of getting results relevant to their query.

Respectfully submitted,
Jim "JR"

Some see things as they are, and ask "Why?"
I dream things that never were, and ask "Why Not".

Robert F. Kennedy

“Impossible” is only found in the dictionary of a fool.
Old Chinese Proverb

rene
Level 12
Posts: 4434
Joined: Sun Mar 27, 2016 6:58 pm

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by rene » Mon Dec 17, 2018 1:39 am

The same goes for filtering out any three-letter word; in ICT, almost every specifically technical term is a three-letter acronym. The forum search is useless and many have noticed, although no change has ever been implemented. The usual recommendation is to search through Google, adding "site:forums.linuxmint.com" to the query.

jharris1993

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by jharris1993 » Mon Dec 17, 2018 2:08 am

rene wrote:
Mon Dec 17, 2018 1:39 am
The forum search is useless and many have noticed, although no change has ever been implemented.
I'm not sure I'd be that critical, especially since most fora don't, and can't, afford the horsepower or manpower to duplicate sites like Google or Dogpile.

I am equally sure that the 90/10 rule applies here: A few (relatively) simple tweaks would probably improve things tremendously. And as interesting as it would be, I have neither the skill, nor the guts, to mess around with the search engine on Mint's live fora. (Though it might be both interesting and instructive to mess around on a test-copy of the system. . . .)

Maybe this is just ignorance speaking here, but I doubt that anyone would have to totally overhaul the search capabilities. Maybe turning off the "common words" filter would be a good way to start and gather data quickly?

What say ye?
Jim "JR"


AZgl1500
Level 11
Posts: 3716
Joined: Thu Dec 31, 2015 3:20 am
Location: Oklahoma where the wind comes sweeping down the plains

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by AZgl1500 » Mon Dec 17, 2018 4:11 am

Nearly all website search engines require 4 characters or they won't search for a term.

Google Advanced Search does much better.

See this Advanced Search for what you wanted:

https://www.google.com/search?as_q=wind ... as_rights=

Window too big
viewtopic.php?t=171589

rene

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by rene » Mon Dec 17, 2018 4:27 am

jharris1993 wrote:
Mon Dec 17, 2018 2:08 am
rene wrote:
Mon Dec 17, 2018 1:39 am
The forum search is useless and many have noticed, although no change has ever been implemented.
I'm not sure I'd be that critical, especially since most fora don't, and can't, afford the horsepower or manpower to duplicate sites like Google or Dogpile.
I'm not saying there can't be good reasons for the forum search being useless, just that it is. If you just type e.g. "site:forums.linuxmint.com foo bar" into Google, the problem's solved...
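
For anyone who wants to script the workaround above, here is a minimal Python sketch that builds such a site-restricted query URL. The function name `site_search_url` is made up for illustration, and it assumes the usual `q` query parameter of Google's search page; it only builds the link and does not fetch any results.

```python
from urllib.parse import urlencode


def site_search_url(terms, site="forums.linuxmint.com"):
    """Build a Google search URL restricted to one site (hypothetical helper)."""
    # Prepend the "site:" operator mentioned in the post, then URL-encode the query.
    query = f"site:{site} {terms}"
    return "https://www.google.com/search?" + urlencode({"q": query})


print(site_search_url("window too big"))
```

Pasting the printed link into a browser should be equivalent to typing the "site:" query by hand.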

Flemur
Level 17
Posts: 7393
Joined: Mon Aug 20, 2012 9:41 pm
Location: Potemkin Village

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by Flemur » Mon Dec 17, 2018 11:47 am

AZgl1500 wrote:
Mon Dec 17, 2018 4:11 am
Nearly all website search engines require 4 characters or they won't search for a term.
That's not true.
Please edit your original post title to include [SOLVED] if/when it is solved!
Your data and OS are backed up....right?

xenopeek
Level 24
Posts: 24193
Joined: Wed Jul 06, 2011 3:58 am
Location: The Netherlands

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by xenopeek » Mon Dec 17, 2018 12:20 pm

This board uses phpBB forum software and is limited to using one of the four search backends that phpBB works with (native phpBB search, or one of three databases' built-in search). Every backend has its pros and cons. The search backend we're using was selected for our board size and load. Its limits are configured keeping in mind that our server has to be able to handle the load. One of its configured limits is that it only indexes words of at least 4 characters, and it ignores English-language stop words. So yes, it's not perfect. We can configure it to index shorter words, but not without cost. This just highlights one of the issues with it, though. People expect it to be as good as what their internet search engine can do, and it just won't be anywhere close to that, regardless of the minimum word length for the search index.
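
To illustrate the kind of filtering described above, here is a minimal Python sketch of a minimum-length plus stop-word filter. It is not phpBB's code or the board's configuration: the 4-character minimum is taken from the post, while the function name `filter_query` and the small stop-word set are assumptions made up for the example.

```python
import re

# Assumed settings: the 4-character minimum comes from the post above;
# the stop-word set is an illustrative sample, not the board's real list.
MIN_WORD_LENGTH = 4
STOP_WORDS = {"the", "and", "too", "big", "with", "that", "this", "from"}


def filter_query(query, min_length=MIN_WORD_LENGTH, stop_words=STOP_WORDS):
    """Split a search query into (kept, ignored) lists of terms."""
    kept, ignored = [], []
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        # A term is dropped if it is too short or is a stop word.
        if len(word) < min_length or word in stop_words:
            ignored.append(word)
        else:
            kept.append(word)
    return kept, ignored


kept, ignored = filter_query("window too big")
print("searched:", kept)     # searched: ['window']
print("ignored:", ignored)   # ignored: ['too', 'big']
```

With these assumed settings, "window too big" collapses to just "window", which matches the behaviour reported in the first post. Calling `filter_query("window too big", min_length=3, stop_words=set())` keeps all three terms, which is roughly what the suggestion in this thread asks for, at the cost of a larger index.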

Mick-Cork
Level 3
Posts: 113
Joined: Sun Mar 23, 2014 10:10 pm
Location: West Cork & London

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by Mick-Cork » Mon Dec 17, 2018 6:05 pm

Not sure how easy it would be to implement in phpBB, but embedding Google Custom Search somewhere on the site might be an option? Not sure if it's the best idea, but maybe worthy of consideration.

https://cse.google.com/cse/

xenopeek

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by xenopeek » Tue Dec 18, 2018 3:44 am

Google CSE is possible with https://www.phpbb.com/customise/db/exte ... oglesearch and https://phpbb.hifikabin.me.uk/app.php/page/CGC. It would be a total replacement, though, and would leave people who use the forums with JavaScript disabled, or who block Google content, without an option to search. It is also a bit of a slap in the face for people trying to get out from under Google's search bubble.

Mick-Cork

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by Mick-Cork » Tue Dec 18, 2018 12:20 pm

Hi Xenopeek,

I agree with the bit about not wanting to rely on Google, but from what I can see on the phpBB extension's demo page, the two search options can co-exist: https://phpbb.hifikabin.me.uk/.

The free version of GCS does throw up ads, but I guess LM could join Adsense and maybe make a little bit of revenue from it. Wouldn't create any millionaires though! :)

Anyway, not necessarily promoting the idea, just for clarification.

xenopeek

Re: Web Site Enhancement: Reduce sensitivity of "common words" filter

Post by xenopeek » Tue Dec 18, 2018 12:24 pm

Ah, that's with it put in the navigation bar. I'm not too sure having two different search fields will make a lot of sense.
