[If you don’t get the reference, look here.]
A colleague had an odd question. There were more than a thousand little log files, all just text. She needed to scan all of them for problems. If only they were one big file, it would be a snap. But was there a way to combine lots and lots of little text files into one big text file in one or two steps?
I said, “There must be, but I wouldn’t be the one to know how.” [The files are on a Windows box, but could always be moved to a Unix box, I suppose.] Then, she said, “You know, like in Word, if you have a bunch of chapters and you want to combine them all into one big manuscript…” Well, making a multichapter document’s a little more complicated, but I knew that you could actually attach multiple files to a document in a single step, using shift-mouse or control-mouse directory selection.
Hmm. Would Word import a whole bunch of text files in one step just by highlighting them all in a directory? If so, that would be a crude but entirely effective solution.
Sure enough: Works great. I happened to have a directory with 30 little .txt files. Opened a blank Word document, clicked on Insert File, chose .txt in the bottom menu, selected everything in the directory: Five or ten seconds later, I had a combined file. Since they were .txt files, Word wasn’t asking me whether to retain formatting or any of that stuff: It just appended everything. A thousand files might take a minute or two.
Of course, all you high-tech folks already knew this. But I didn’t. And it will save the colleague a couple of hours of inquiry…
“She needed to scan all of them for problems.”
What was wrong with using any of the gazillion utilities that scan an entire directory?
Choose your DeskTop Search … or Norton …
On a Unix box (which is where most logfiles live anyway, yes?), the cat command will do the trick. cd to the directory, then
cat *.txt > bigfile.txt
Works a treat.
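Since she wants to eyeball each log, it may help to mark where one file ends and the next begins. Here is a minimal sketch along the same lines as the cat one-liner; the function name `combine_logs`, the output name `combined.log`, and the header format are all my own assumptions, not anything from the original setup.

```shell
#!/bin/sh
# combine_logs DIR OUTFILE  (hypothetical helper, for illustration only)
# Concatenates every .txt file in DIR into OUTFILE, writing a header line
# with the file name before each file's contents.
# Give OUTFILE a name that does not end in .txt (or put it in another
# directory), otherwise the loop would pick up its own output.
combine_logs() {
    dir=$1
    out=$2
    : > "$out"                                  # create or empty the output file
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue                 # skip if nothing matched the glob
        printf '===== %s =====\n' "${f##*/}" >> "$out"   # header: bare file name
        cat "$f" >> "$out"
        printf '\n' >> "$out"                   # blank line between files
    done
}
```

From inside the log directory, something like `combine_logs . combined.log` would then produce one file with a labelled section per log.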
On a Windows box the cat command will do the trick! Cygwin is your friend.
But then, I’m an old-school BSD Unix weenie.
Ah, but another question leaps to mind that might be more important. What were the problems in the logs that she was looking for? Were they regular enough that you could grep them out via Cygwin? I must admit to not being a Word power user, but if she was only looking for certain lines in what will become a massive file, is there a good way to do that via Word without inducing carpal tunnel? I know you can search by regex in Word via the “Wildcards search”, but does anyone know a quick way to extract those matches without visiting every line?
Hmmm, something to check out later when I get the chance.
But then perhaps I’m just a new-school Linux weenie ;). Probably just went past any reasonable point for the user.
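For what it's worth, the grep idea can be sketched in a few lines, assuming the problems show up as some recognizable word or pattern. The function name `find_problems` and the example pattern `error` are placeholders of mine; nothing in the post says what the real log entries look like.

```shell
#!/bin/sh
# find_problems PATTERN DIR  (hypothetical helper, for illustration only)
# Prints every line in DIR's .txt files that matches PATTERN, ignoring case,
# prefixed with the file name and line number.
find_problems() {
    pattern=$1
    dir=$2
    # /dev/null is passed as an extra file so grep always prints the file
    # name, even when only one .txt file exists in the directory.
    grep -n -i -e "$pattern" "$dir"/*.txt /dev/null
}
```

Running `find_problems 'error' . > matches.txt` in the log directory would collect the matching lines in one small file, with no need to open the logs individually. If the problems are not regular enough to describe with a pattern, this does not help and a visual scan of the combined file is the way to go.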
I’ll respond to all four at once:
1. She needs to actually look at the text in each file, and it’s only a few lines per file. The big problem is opening and closing all those files. This quick technique eliminates that.
2. These are log files from MS SQL and related items; they’re all from Windows boxes.
3. If “cat” will do the same thing on Windows that it does in Unix, that is, concatenate huge numbers of files in a single step, that might be equally straightforward… thanks.
4. I’m guessing that it’s not a case where a regular expression will find the problem. It is a case where she thinks the visual scan won’t take long.
Overall, there’s another issue: For something you only need to do, say, once or once a year, why would you go find and learn a new tool or utility when something you already know will do the job? That’s my attitude on “special keyboard tricks” in most applications as well: Unless it’s something I do at least a few times a month, why would I bother trying to memorize stuff that might save me two keystrokes at a time?
For me, for now, the only keystroke combos that matter are ctrl-c [copy], ctrl-v [paste], ctrl-z [undo, particularly nice for turning “smart quotes” back into inch signs when I’m talking about measurements], and alt-shift-x, Index, when I’m updating the C&I index document. For other people, these are pointless to remember, but other keystroke combos are wonderful. Ditto “when to use right-mouse,” although with Windows apps the answer is almost always “give it a try.”
I think you’ll find that cat doesn’t work in straight Windows. Both the other poster and I were referring to Cygwin (http://www.cygwin.com/), which provides a Unix-like environment on Windows.
I agree on the problem of special tools for infrequent jobs. My gut suspicion is that problems like this are more frequent than most would realize, but it’s hard to really have perspective as a “techie”. Of course, the other difficulty is that even people who do some task frequently don’t necessarily realize that there are tools out there that could make their job easier. Ah well, the infamous catch-22 of the computing world.
I didn’t know about this particular trick. Thanks for sharing!
If you open a command window on your Windows box (Start > Programs > Accessories > Command Prompt or Start > Run… > cmd), you can use an old DOS trick: use the COPY command with multiple files to create a new file that concatenates them together. Start by navigating to the appropriate directory at the command prompt, for example…
cd C:\mydirectory
…and then if your files are named file001.txt, file002.txt, etc., you can issue the command…
copy file*.txt bigfile.txt
…where bigfile.txt is the name of the new, concatenated file. If the files don’t have nice neat sequential filenames like that, just create a new directory to work in and move them all there first (using Explorer or command line, whichever you prefer) — then you can just…
copy *.* bigfile.txt
I believe they are pulled into the concatenated file in ASCII order by filename. Hope that helps somebody.
“For something you only need to do, say, once or once a year, why would you go find and learn a new tool or utility when something you already know will do the job?”
Well, DeskTop Search is generally useful, and basically seems to be the right tool for this job – what if you do need to do it again? What if you need information which was lost in the import to Word? What if someone changes the requirements ever so slightly?
My view is certainly colored by my perspective that some of my job is dealing with the above issues.
Wow (she says while smacking her head)… why didn’t I think of that? Like you say, it’s not that I’m not capable of finding, installing, learning and using a new utility… I just would prefer to use what I’ve got if it will work and I only need to do it infrequently.
What if one did need to retain formatting?
Adrienne,
I’m not sure what question you’re asking. This particular situation didn’t involve formatting issues. There are, or at least used to be, ways to combine several Word documents with different formatting into a single master document, retaining the format of each document, but that’s a whole different set of issues…
I have a 300 page doc that’s all in Courier New 11 and I want to add to the end of it a 100 page doc that uses multiple fonts and contains tables and some landscape pages. When I add it, the second doc takes the formatting of the larger doc. I’ve used Insert File and copy and paste, inserted breaks, etc., but it still screws things up. Some of the pages are custom sizes, which is changing the size of the previous page on insert. Won’t let me change the page size back to letter. Thanks.
Sorry; you’d need expert help on that. If you can find Word help on “master documents” and the like, it might help–but that may have changed.