Thursday, April 15, 2010

Self Plagiarism keeps our country strong

I'm lazy and haven't been bloggerating much lately, so herein I reproduce for you an e-mail chain that makes me sound really smart and cool and also doesn't require any rewriting or work on my part.

From Hedgehog:

WTF dude! Now I want to procrastinate and enjoy life via the internets with you and you are not here!

I wish I never split this data into a million little SPSS files and then did stuff with it, I wish I had one master file with all the pieces, but I don't want to have to make that master file now. Variables have different settings, I computed variables on some files but not others, I changed the formats, some are aggregated.

Fucking shit!

This is what happens when your boss wants to keep slicing the data into little pieces to look at just this and just that without a master plan behind the whole thing.

I wish I had a master file to begin with and I wrote syntax to do all that shit, but it was always just one little thing and then one more little thing, and inch by inch I have created chaos.

My reply:

Oh boy are you screwed. This is why SAS or Stata are better, because you can easily save multiple files in all the different kinds of slices you want and then undo it and go back to the original and use "if" statements and all that.

I am in a similar chaos with the [acronym] data, which are created by taking in text files the districts send us and then converting them to Microsoft Access tables and then using queries to create class and school-level files, which are then opened in Microsoft Excel and formulas are entered to compute some variables like class size and then those files are saved as .csv files which are read into Stata, where we impute missing values and create six more files which are then opened individually in HLM and models are run and then output into folders named informative things like "01" and "23." I really really really kind of hate the guy who is in charge of the data for this right now, and especially for NOT BELIEVING ME that I can do ALL OF THAT in Stata, and yes it will take some time to write the code, but for the love of God it will save us about three days of processing the next time data comes in.
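For the curious, the "one script instead of Access plus Excel" idea can be sketched in a few lines. This is Python rather than Stata, and it is only a rough sketch: the column names (`student_id`, `school_id`, `class_id`) and all the function names are made up for illustration and are not the real [acronym] layout. It covers only the text-file-to-class-size part; the imputation and HLM steps are left out.

```python
import csv
import io
from collections import Counter


def read_rows(text_file):
    """Read one delimited text file into a list of dicts (hypothetical layout)."""
    return list(csv.DictReader(text_file))


def class_sizes(rows):
    """Count students per (school, class): the step otherwise done by hand in Excel."""
    return Counter((r["school_id"], r["class_id"]) for r in rows)


def school_summary(sizes):
    """Roll the class-level counts up to one entry per school."""
    summary = {}
    for (school, _class_id), n in sizes.items():
        entry = summary.setdefault(school, {"n_classes": 0, "n_students": 0})
        entry["n_classes"] += 1
        entry["n_students"] += n
    return summary


def write_class_csv(sizes, out_file):
    """Write the class-level file as CSV, ready for the stats package."""
    writer = csv.writer(out_file)
    writer.writerow(["school_id", "class_id", "class_size"])
    for (school, class_id), n in sorted(sizes.items()):
        writer.writerow([school, class_id, n])


if __name__ == "__main__":
    # Tiny invented sample standing in for a district's text file.
    raw = io.StringIO(
        "student_id,school_id,class_id\n"
        "1,A,101\n2,A,101\n3,A,102\n4,B,201\n"
    )
    sizes = class_sizes(read_rows(raw))
    out = io.StringIO()
    write_class_csv(sizes, out)
    print(out.getvalue())
    print(school_summary(sizes))
```

The payoff is the one complained about above: when new data comes in, you rerun the script instead of clicking through every step again.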

Anyway, you at least have a good excuse.

The long-term lesson is that there's nothing you can do now but either make the pain-in-the-ass effort to put it all back together, or hobble on the way things are until you don't have to use the data anymore... it's like trying to swim across the ocean... are you more than halfway across? Because if so, don't turn back just because you're too tired.

That damn metaphor fails me every time. The fact is that if you have swum halfway across the ocean you're probably about to die by shark attack or exposure or just plain being stupid, so I can't help you.

Man, I should convert this into a blog instead. This shit is way too good to be wasted on just an e-mail to you. How will I make it in the blog big time if I keep conversing privately with people??


