Forums before death by AOL, social media and spammers... "We can't have nice things"
|    alt.msdos.batch    |    Fun with MS-DOS batch files    |    42,547 messages    |
|    Message 42,163 of 42,547    |
|    Paul to Maxmillian    |
|    Re: Lower case and diff two text files c    |
|    23 Mar 23 15:31:20    |
   
   XPost: alt.comp.os.windows-10, alt.comp.microsoft.windows   
   From: nospam@needed.invalid   
      
   On 3/23/2023 2:31 PM, Maxmillian wrote:   
   > I have two long lists of email addresses in Windows 10 as text files.   
   >   
   > How can I lowercase everything and then get a diff of what email   
   > addresses are in one text file but not in the other text file?   
   >   
      
   **************************** diffemail.awk **************************   
      
   # Assumes file1.txt and file2.txt are in the current working directory   
   #   
   # gawk.exe -f diffemail.awk file2.txt   
      
   BEGIN {
       # Load file1.txt into memory. The name is hard-coded because I am
       # too lazy to pass it as a parameter.
       while ( (getline < "file1.txt") > 0 ) {
           $0 = tolower($0)
           arr[$0]++    # The array index is the key. The stored count is
                        # currently a don't-care value, but you can use it
                        # to detect duplicates if you want.
       }
       close("file1.txt")    # Polite are we...
   }
      
   # Program body: we are now reading file2.txt, one line at a time, and
   # checking whether each entry is also in file1.txt.
   {
       $0 = tolower($0)
       if ($0 in arr) {    # is this incoming entry an index of arr[]?
           print $0 " is in both files"
       } else {
           print $0 " is not in file1.txt"
       }
   }
      
   **************************** END diffemail.awk **************************   
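The comment next to arr[$0]++ says the stored counts could be used to detect duplicates. A rough sketch of that idea follows, in Python rather than awk for comparison. The function name find_duplicates and the choice to pass the lines in as a list are assumptions made for illustration; they are not part of diffemail.awk.

```python
from collections import Counter


def find_duplicates(lines):
    """Return {address: count} for every address seen more than once.

    Each line is stripped and lowercased first, so "Foo@x.com" and
    "foo@x.com" count as the same entry. Blank lines are ignored.
    """
    counts = Counter(line.strip().lower() for line in lines if line.strip())
    return {addr: n for addr, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Made-up sample with one address repeated in a different case.
    sample = ["fOo@computer.com", "baR@computer.com", "FOO@computer.com"]
    print(find_duplicates(sample))  # {'foo@computer.com': 2}
```

The counting step plays the same role as arr[$0]++ in the awk script: the key is the lowercased line, and the value is how many times it was seen.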
      
   file1.txt   
   fOo@computer.com   
   baR@computer.com   
   bAz@computer.com   
      
   file2.txt   
   foO@computer.com   
   Bar@computer.com   
   Baz@computer.com   
   not@in.computer   
      
   Output   
      
   PS D:\> .\gawk.exe -f diffemail.awk file2.txt   
   foo@computer.com is in both files   
   bar@computer.com is in both files   
   baz@computer.com is in both files   
   not@in.computer is not in file1.txt   
   PS D:\>   
      
   You can spice up the program with as much if-then-else
   as you care to. You can even store both files in memory
   if you want.
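As a sketch of the "store both files in memory" variant, the following Python (not awk) loads both lists into sets and reports the difference in each direction, which is what the original question asked for. The names normalize and two_way_diff are hypothetical, and the sample data is copied from file1.txt and file2.txt above.

```python
def normalize(lines):
    """Lowercase and strip each line, dropping blanks; return a set."""
    return {line.strip().lower() for line in lines if line.strip()}


def two_way_diff(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as sorted lists of addresses."""
    a = normalize(lines_a)
    b = normalize(lines_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    file1 = ["fOo@computer.com", "baR@computer.com", "bAz@computer.com"]
    file2 = ["foO@computer.com", "Bar@computer.com",
             "Baz@computer.com", "not@in.computer"]
    only_in_1, only_in_2 = two_way_diff(file1, file2)
    print("only in file1:", only_in_1)  # only in file1: []
    print("only in file2:", only_in_2)  # only in file2: ['not@in.computer']
```

With real files, open file objects can be passed instead of lists, since iterating over a text file yields its lines.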
      
   *******   
      
   The gawk.exe file is in the binaries ZIP file here:   
      
   https://gnuwin32.sourceforge.net/packages/gawk.htm   
      
    Binaries Zip 1,448,542 10 February 2008 f875bfac137f5d2b38dd9fdc9408b5a
      
    Name: gawk-3.1.6-1-bin.zip   
    Size: 1448542 bytes (1414 KiB)   
    SHA1: BDA507655EB3D15059D8A55A0DAF6D697A15F632   
      
   The program file uses Windows (CRLF) line endings, whereas a version
   for the bash shell would use Linux (LF) line endings.
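Line endings can matter for the data files too: if a list saved with CRLF endings is split on LF alone, every entry keeps a trailing carriage return and identical addresses stop matching. The Python sketch below shows one way to sidestep that. The name read_entries is hypothetical, and the input is assumed to be plain ASCII, as in the rest of this post.

```python
def read_entries(raw_bytes):
    """Split raw file contents into non-empty lines.

    str.splitlines() treats CRLF, LF, and a bare CR as line breaks,
    so the result is the same for Windows and Linux files.
    """
    text = raw_bytes.decode("ascii", errors="replace")
    return [line for line in text.splitlines() if line]


if __name__ == "__main__":
    windows = b"foo@computer.com\r\nbar@computer.com\r\n"
    linux = b"foo@computer.com\nbar@computer.com\n"
    print(read_entries(windows) == read_entries(linux))  # True
    # Splitting on "\n" alone leaves the carriage return attached:
    print(windows.decode("ascii").split("\n")[0] == "foo@computer.com")  # False
```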
      
   The program does not support Unicode or the like. It is
   just for plain ASCII at the moment.
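For what a Unicode-aware comparison might look like, here is a small Python sketch. It is not something the awk script does. The name fold_key is made up, and normalizing with NFKC before case folding is a simplification chosen for illustration, not a complete caseless-matching procedure.

```python
import unicodedata


def fold_key(address):
    """Build a comparison key: strip, NFKC-normalize, then case-fold.

    casefold() is more aggressive than lower(): for example it maps the
    German sharp s to "ss", so "straße" and "STRASSE" get the same key.
    """
    return unicodedata.normalize("NFKC", address.strip()).casefold()


if __name__ == "__main__":
    print(fold_key("Foo@Computer.com"))  # foo@computer.com
    print(fold_key("STRASSE@example.com") == fold_key("straße@example.com"))  # True
    # Precomposed and decomposed accents also end up with the same key:
    print(fold_key("caf\u00e9@example.com") == fold_key("cafe\u0301@example.com"))  # True
```

The files would then need to be read with an explicit encoding such as UTF-8, and the keys compared instead of the raw lines.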
      
   It's not really a practical program, just a demo of   
   how easy it is to whip something up.   
      
   And every language... has something it is not good at.   
   This language is not an exception to that.   
      
    Paul   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
(c) 1994, bbs@darkrealms.ca