Comparing text files from two different directories in Python

johnjosef
Level 1
Posts: 1
Joined: Thu Dec 24, 2020 2:21 am

Comparing text files from two different directories in Python

Post by johnjosef »

Suppose there are two directories, Dir1 and Dir2, each containing n text files: say t1.txt, t2.txt, t3.txt, t4.txt, t5.txt, ... in Dir1 and tt1.txt, tt2.txt, tt3.txt, tt4.txt, tt5.txt, ... in Dir2.

I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.

Using Python
Last edited by xenopeek on Fri Dec 25, 2020 3:55 am, edited 1 time in total.
Reason: commercial link removed
vimes666
Level 4
Posts: 402
Joined: Tue Jan 19, 2016 6:08 pm

Re: Comparing text files from two different directories in Python

Post by vimes666 »

I am not familiar with Python. However, if it is possible to run Linux commands from Python, maybe this one will help:

grep -Fxf <(ls dir-a) <(ls dir-b)
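
For reference, the same idea can be written directly in Python. This is only a minimal sketch: `common_names` is a hypothetical helper name, and like the grep command it compares file names only, not contents. With the naming in the original post (t1.txt versus tt1.txt) no names would be shared, so this only helps if "matching" means identical names.

```python
import os


def common_names(dir_a, dir_b):
    """Return the sorted entry names that appear in both directories."""
    # os.listdir returns bare names, so a set intersection gives the overlap.
    return sorted(set(os.listdir(dir_a)) & set(os.listdir(dir_b)))
```

Calling `common_names("dir-a", "dir-b")` would return a list of the names present in both directories, assuming both paths exist.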
Flemur
Level 19
Posts: 9178
Joined: Mon Aug 20, 2012 9:41 pm
Location: Potemkin Village

Re: Comparing text files from two different directories in Python

Post by Flemur »

johnjosef wrote:
Fri Dec 25, 2020 3:38 am
I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.
Does "matching" mean e.g. t1.txt matches tt1.txt, or the content of t1.txt is the same as the content of tt4.txt?
Petermint
Level 6
Posts: 1335
Joined: Tue Feb 16, 2016 3:12 am

Re: Comparing text files from two different directories in Python

Post by Petermint »

File size? Quantities?

For small files, you could read every file into memory as strings then compare strings.
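
A minimal sketch of the in-memory approach, assuming the files are small, UTF-8 encoded, and end in .txt. The names `read_texts` and `matching_pairs` are made up for illustration.

```python
from pathlib import Path


def read_texts(directory):
    """Map each .txt file name in `directory` to its full text."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(directory).glob("*.txt"))
    }


def matching_pairs(dir1, dir2):
    """Return (name_in_dir1, name_in_dir2) pairs whose text is identical."""
    texts1 = read_texts(dir1)
    texts2 = read_texts(dir2)
    # Every file of dir1 is compared with every file of dir2.
    return [
        (name1, name2)
        for name1, text1 in texts1.items()
        for name2, text2 in texts2.items()
        if text1 == text2
    ]
```

Something like `matching_pairs("Dir1", "Dir2")` would then list which file in Dir1 matches which file in Dir2.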

If you have big directories or big files, there are numerous tricks to make the comparison faster, such as matching lengths first. Loop through directory A. For each file in A, loop through B looking for files with the same length. Perform a more detailed comparison only when the lengths match.
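
The length-first idea could be sketched like this. `find_matches` is a hypothetical name, and as a small variation on the nested loop described above it groups the files of B by size once, so each file of A only meets candidates of equal length. The detailed comparison here uses the standard library's `filecmp.cmp`.

```python
import filecmp
import os


def find_matches(dir_a, dir_b):
    """Yield (path_a, path_b) pairs with identical content, checking sizes first."""
    # Group the files of dir_b by size in bytes.
    by_size = {}
    for name in sorted(os.listdir(dir_b)):
        path_b = os.path.join(dir_b, name)
        if os.path.isfile(path_b):
            by_size.setdefault(os.path.getsize(path_b), []).append(path_b)

    for name in sorted(os.listdir(dir_a)):
        path_a = os.path.join(dir_a, name)
        if not os.path.isfile(path_a):
            continue
        # Only files of the same length are worth a detailed comparison.
        for path_b in by_size.get(os.path.getsize(path_a), []):
            # shallow=False makes filecmp compare the actual contents.
            if filecmp.cmp(path_a, path_b, shallow=False):
                yield path_a, path_b
```

Looping over `find_matches("Dir1", "Dir2")` and printing each pair would report the matching files.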

The more detailed comparison could compute an MD5 hash or similar for each file. Alternatively, you can run a system utility to do the hashing.
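
In Python the MD5 value can be computed with the standard `hashlib` module, without calling an external utility. A minimal sketch follows; the helper name `md5_of_file` and the 64 KiB chunk size are arbitrary choices for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for big files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with the same content produce the same digest, so comparing `md5_of_file(a) == md5_of_file(b)` is enough for this purpose. MD5 is fine for spotting duplicates but should not be relied on against deliberate tampering.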
HBaguette
Level 1
Posts: 11
Joined: Wed Dec 30, 2020 10:49 pm

Re: Comparing text files from two different directories in Python

Post by HBaguette »

If the files aren't very big, you could just open both of them and iterate through them, comparing each individual character. This would likely be much slower than other ways, though, and more resource-heavy, but I'm pretty exhausted right now and can't think of much better at the moment.

First, like other people have suggested, I'd compare the file sizes to see if it's even worth iterating in the first place. If they match, then iterate through each individual character, and the second it finds any difference, stop comparing (as the files obviously don't match).
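
A rough sketch of that early-exit comparison. Note that it compares blocks of bytes rather than single characters, which is a swapped-in detail that gives the same answer with far fewer reads. `files_equal` is a hypothetical name.

```python
import os


def files_equal(path_a, path_b, chunk_size=65536):
    """Return True if both files have identical bytes, stopping at the first difference."""
    # Different sizes can never match, so skip reading entirely.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # first mismatch found, no need to go on
            if not chunk_a:
                return True  # both files ended without a mismatch
```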
Termy
Level 6
Posts: 1444
Joined: Mon Sep 04, 2017 8:49 pm
Location: UK

Re: Comparing text files from two different directories in Python

Post by Termy »

In Perl, I'd first check file sizes, in bytes, then if they're the same, I'd use Digest::MD5, Digest::SHA, or some equivalent. I'm sure there's a similar process in Python.
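
The closest Python counterpart to Digest::MD5 and Digest::SHA is the standard `hashlib` module. Below is a minimal sketch of the same size-then-digest check using SHA-256; the function names `sha256_of` and `same_content` are invented for this example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Check sizes in bytes first, and only hash when the sizes agree."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return sha256_of(path_a) == sha256_of(path_b)
```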
JosephM
Level 6
Posts: 1229
Joined: Sun May 26, 2013 6:25 pm

Re: Comparing text files from two different directories in Python

Post by JosephM »

Well the OP never responded but this looks a lot like someone's homework problem ;)
TheyLive
Level 2
Posts: 59
Joined: Wed Jun 03, 2020 1:47 pm

Re: Comparing text files from two different directories in Python

Post by TheyLive »

johnjosef wrote:
Fri Dec 25, 2020 3:38 am
I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.

Using Python
HashDeep
It is a police investigation tool. It will find matches between files of any kind, not only text files.
You can write a wrapper for it in Python.