Comparing text files from two different directories in Python

johnjosef

Comparing text files from two different directories in Python

Post by johnjosef »

Suppose there are two directories, Dir1 and Dir2, each containing n text files: t1.txt, t2.txt, t3.txt, t4.txt, t5.txt, ... in Dir1 and tt1.txt, tt2.txt, tt3.txt, tt4.txt, tt5.txt, ... in Dir2.

I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.

Using Python
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 2 times in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
vimes666

Re: Comparing text files from two different directories in Python

Post by vimes666 »

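I am not familiar with Python. However, if it is possible to run Linux commands from Python, maybe this one will help. In bash, it prints the file names that appear in both directories (it compares names, not contents):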

grep -Fxf <(ls dir-a) <(ls dir-b)
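
For anyone who would rather stay inside Python, here is a minimal sketch of roughly the same check. The helper name common_names is made up for illustration. Like the grep line, it only compares file names, and unlike ls it also sees hidden files.

```python
import os


def common_names(dir_a, dir_b):
    """Return the sorted file names that appear in both directories.

    Hypothetical helper: it compares names only, never file contents.
    """
    return sorted(set(os.listdir(dir_a)) & set(os.listdir(dir_b)))
```

Calling common_names("dir-a", "dir-b") returns a plain list, so it can be fed straight into a for loop.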
Flemur

Re: Comparing text files from two different directories in Python

Post by Flemur »

johnjosef wrote: Fri Dec 25, 2020 3:38 am I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.
Does "matching" mean e.g. t1.txt matches tt1.txt, or the content of t1.txt is the same as the content of tt4.txt?
Petermint

Re: Comparing text files from two different directories in Python

Post by Petermint »

File size? Quantities?

For small files, you could read every file into memory as strings then compare strings.
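
A rough sketch of that small-file approach, assuming "matching" means identical content and that every file of interest ends in .txt. The function name find_matches is invented here, not something from the original question.

```python
from pathlib import Path


def find_matches(dir1, dir2):
    """Return (name1, name2) pairs whose text content is identical."""
    # Read every file of dir2 once and keep the contents in memory.
    contents2 = {p.name: p.read_text() for p in sorted(Path(dir2).glob("*.txt"))}
    matches = []
    for p1 in sorted(Path(dir1).glob("*.txt")):
        text1 = p1.read_text()
        for name2, text2 in contents2.items():
            if text1 == text2:
                matches.append((p1.name, name2))
    return matches
```

This holds all of Dir2 in memory at once, so it only makes sense when the files are small.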

If you have big directories or big files, there are numerous tricks to make the comparison faster. Things like matching lengths first. Loop through directory A. For each A, loop through B looking for files with the same length. Perform a more detailed compare when lengths match.

The more detailed comparison could compute an MD5 hash (or similar) of each file and compare the hashes. You can also run a system utility as a subprocess to do this.
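
The loop described above could look something like this in Python. It is a sketch under assumptions: both directories contain only regular files, and the names md5_of and match_by_size_then_md5 are hypothetical. MD5 is used here only to detect equal content, not for anything security related.

```python
import hashlib
import os


def md5_of(path):
    """Return the MD5 hex digest of a file (reads the whole file at once)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def match_by_size_then_md5(dir_a, dir_b):
    """Return (name_a, name_b) pairs with equal size and equal MD5."""
    matches = []
    for name_a in sorted(os.listdir(dir_a)):
        path_a = os.path.join(dir_a, name_a)
        size_a = os.path.getsize(path_a)
        digest_a = None  # only computed once a size match turns up
        for name_b in sorted(os.listdir(dir_b)):
            path_b = os.path.join(dir_b, name_b)
            if os.path.getsize(path_b) != size_a:
                continue  # different length, so the content cannot match
            if digest_a is None:
                digest_a = md5_of(path_a)
            if md5_of(path_b) == digest_a:
                matches.append((name_a, name_b))
    return matches
```

The size check is cheap because it only reads directory metadata, so most non-matching pairs are rejected without opening either file.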
HBaguette

Re: Comparing text files from two different directories in Python

Post by HBaguette »

If the files aren't very big, you could just open both of them and iterate through them, comparing each individual character. This would likely be much slower than other ways, though, and more resource-heavy, but I'm pretty exhausted right now and can't think of much better at the moment.

First, like other people have suggested, I'd compare the file sizes, to see if it's even worth iterating in the first place. If they match, then iterate through each individual character, and the second it finds any difference, stop comparing them (as they obviously don't match).
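
A sketch of that idea, reading in chunks rather than one character at a time, which gives the same early exit with far fewer reads. The name same_content and the 64 KiB chunk size are arbitrary choices for illustration. The standard library's filecmp.cmp(a, b, shallow=False) does a similar content comparison.

```python
import os


def same_content(path_a, path_b, chunk_size=65536):
    """Return True if both files hold exactly the same bytes."""
    # Cheap test first: different sizes can never match.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # stop at the first difference
            if not chunk_a:
                return True  # both files ended without a difference
```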
Termy

Re: Comparing text files from two different directories in Python

Post by Termy »

In Perl, I'd first check file sizes, in bytes, then if they're the same, I'd use Digest::MD5, Digest::SHA, or some equivalent. I'm sure there's a similar process in Python.
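
For reference, Python's hashlib module covers the same ground as those Perl modules. Below is a sketch of the size-then-digest idea using SHA-256. The names sha256_of and match_by_digest are invented for this example, and it assumes a byte-for-byte match is what is wanted.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 64 KiB blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def match_by_digest(dir1, dir2):
    """Map each file name in dir1 to the names in dir2 with equal content."""
    # Index dir2 by size so only same-size candidates are ever hashed.
    by_size = defaultdict(list)
    for p in sorted(Path(dir2).iterdir()):
        if p.is_file():
            by_size[p.stat().st_size].append(p)
    result = {}
    for p in sorted(Path(dir1).iterdir()):
        if not p.is_file():
            continue
        candidates = by_size.get(p.stat().st_size, [])
        digest = sha256_of(p) if candidates else None
        result[p.name] = [c.name for c in candidates if sha256_of(c) == digest]
    return result
```

Files in dir1 with no match map to an empty list, which makes it easy to see what was left over.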
JosephM

Re: Comparing text files from two different directories in Python

Post by JosephM »

Well the OP never responded but this looks a lot like someone's homework problem ;)
TheyLive

Re: Comparing text files from two different directories in Python

Post by TheyLive »

johnjosef wrote: Fri Dec 25, 2020 3:38 am I want to run a for loop such that each text file of Dir1 is compared with Dir2 and returns the matching text file.

Using Python
HashDeep
It is a tool used in police investigations. It finds matching files of any type, not only text.
You can write a wrapper for it in Python.