GnuWin32 Trick: Quickly Finding Text in File

June 1st, 2017

The downside of not managing a good library of scripts is forgetting where some code is written.  Case in point: I wrote a nice RMSE script, but I forgot where it was.

So I found it with the following command line:

grep "rmse" $ls -R ./*/*.R ./*/*.r

The first part of the command, grep "rmse", tells grep to look for rmse.  The second part, $ls -R ./*/*.R ./*/*.r, tells it which files to look through (a recursive listing of *.R and *.r files).
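For comparison, here is a minimal sketch of the same search that lets grep do the recursion itself. It assumes a GNU grep build that supports --include; the folder and file names are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical sandbox: two throwaway R scripts in nested folders.
demo=$(mktemp -d)
mkdir -p "$demo/stats/old" "$demo/plots"
printf 'rmse <- function(a, b) sqrt(mean((a - b)^2))\n' > "$demo/stats/old/error.R"
printf 'plot(1:10)\n' > "$demo/plots/chart.r"

# -r recurses to any depth, -l prints only the names of matching files,
# and each --include limits the search to one filename pattern.
found=$(cd "$demo" && grep -r -l --include='*.R' --include='*.r' "rmse" .)
echo "$found"
```

Unlike the ./*/*.R glob, which only reaches one folder down, this form finds the script however deeply it is buried.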

DOS Commands You Should Know: FINDSTR

February 4th, 2014

The last time I talked about DOS, it was FIND.  Find is great for certain uses, but not for others… like when you need to search for a string through a lot of files in many subfolders.

In my case, I wanted to find where I’ve used DELIMITER in a Cube script.  I tried Microsoft’s example, and it doesn’t work (and their comment box doesn’t work with Chrome, so there’s that, too).

This is a two-step process.  The first step is easy, and it uses a very basic DOS command: dir.

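dir *.s /a /s /b >filelist

This creates a list of the files to search in the current folder and its subfolders.  The /s switch is what pulls in the subfolders, and combined with /b it makes each line of the list a full path.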

The second command is actually three-in-one:

echo off & for /F "tokens=*" %A in (filelist) do findstr /i /m "DELIMITER" "%A"

The first part of this is “echo off”.  This stops the prompt from echoing each command before it runs (otherwise, you’ll see every findstr command).

The second part is the for… do loop.  This basically says “for each line in the file” and stores it (temporarily) as %A.  The "tokens=*" option keeps each whole line, spaces included, as a single value.

The third part is the findstr command.  The i switch turns off case sensitivity, and the m switch prints ONLY files that match.  I’m searching for DELIMITER (not case sensitive, of course).  The “%A” is the file to search, being passed along from the for…do loop.  This is in quotes because there are spaces in some of my path names, and without the quotes, the command would fail when a space is encountered because it would think it is the end of input.
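The same list-then-search idea can be sketched with GnuWin32-style tools in a POSIX shell. This is only an illustration under assumptions: grep -i -l stands in for findstr /i /m, and the sample files and their contents are invented.

```shell
#!/bin/sh
# Hypothetical sandbox: one folder name contains a space on purpose.
demo=$(mktemp -d)
mkdir -p "$demo/model run"
printf 'uses a Delimiter here\n' > "$demo/model run/first.s"
printf 'nothing to see\n' > "$demo/second.s"
printf '%s\n' "$demo/model run/first.s" "$demo/second.s" > "$demo/filelist"

# Read the list one line at a time; IFS= and -r keep spaces and backslashes intact.
matches=$(while IFS= read -r f; do
    # -i ignores case, -l prints only the name of a file that matches.
    # grep exits non-zero when a file has no match; that is not an error here.
    grep -i -l "DELIMITER" "$f" || true
done < "$demo/filelist")
echo "$matches"
```

Quoting "$f" plays the same role as the quotes around %A: without it, the path with the space would be split into two arguments.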

This is useful if you’re like me and have 1,563,169 lines of script file in your model folder!

BONUS TIP!

I found the number of lines using gawk wrapped in the same process:

echo off & for /F "tokens=*" %A in (filelist) do gawk 'END{print NR}' "%A" >> filelen

This gave me a long list of numbers that I brought into Excel to get the sum.

In the gawk command, 'END{print NR}' means to print the number of records (by default, lines) once the end of the file is reached.  "%A" is the file to check (just like in the findstr command).  The >>filelen APPENDS the output to a file called filelen.  It is important to use the append here because the command runs once per loop.  If a single > is used, the file is overwritten on every pass, so only the last file’s line count ends up in it.
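The spreadsheet step can be skipped by letting awk add the numbers up as well. The sketch below assumes a POSIX shell with awk standing in for gawk; the sample files are invented and exist only to make the totals easy to check.

```shell
#!/bin/sh
# Hypothetical sandbox: a 3-line file and a 2-line file.
demo=$(mktemp -d)
printf 'a\nb\nc\n' > "$demo/one.s"
printf 'd\ne\n' > "$demo/two.s"
printf '%s\n' "$demo/one.s" "$demo/two.s" > "$demo/filelist"

# Start from an empty output file so the appends do not pile onto an old run.
: > "$demo/filelen"
while IFS= read -r f; do
    # >> appends one count per file; a single > would keep only the last count.
    awk 'END{print NR}' "$f" >> "$demo/filelen"
done < "$demo/filelist"

# Add up the per-file counts instead of summing them in a spreadsheet.
total=$(awk '{sum += $1} END{print sum}' "$demo/filelen")
echo "$total"
```

Here filelen ends up holding 3 and 2 on separate lines, and the final awk call prints their sum.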

DOS Commands You Should Know: FIND

November 26th, 2013

Recently, I stumbled upon a problem in my new mode choice and distribution code: I was setting unavailable modes to -9999 to ensure that there was no chance of the model choosing an unavailable mode.  I found later that this value was a bit extreme and that I should be using something like -15 (the difference causes wild logsum values).

After changing these values in 10 scripts, I wanted to make sure that ALL of them were changed, so I didn’t end up running the model and waiting another 15 minutes only to hit an error (or worse, not finding the error right away!).

So, I used the FIND command in DOS.

All of my distribution files begin with 25 and end with .S, so I used:

find "=-9999" 25*.S

Missed a few in these files.  The filename is listed there so I can go to it and fix it.

Missed a bunch in this file.  This is why I checked :-)
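A similar check can be sketched with grep for anyone working with the GnuWin32 tools instead of FIND. The file names and contents below are invented; only the 25*.S pattern and the =-9999 search string come from the post.

```shell
#!/bin/sh
# Hypothetical sandbox: one script already fixed, one with two leftovers.
demo=$(mktemp -d)
printf 'util=-15\n' > "$demo/25a.S"
printf 'util=-9999\nother=-9999\n' > "$demo/25b.S"

# -F treats the pattern as a literal string; -c prints file:count for every
# file, so a count of 0 confirms that a script is clean.
counts=$(cd "$demo" && grep -F -c -e "=-9999" 25*.S)
echo "$counts"
```

Any file reporting a count above zero still has values left to change.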
