Running U-SQL Advanced Analytics Locally Eats My Disk Space

By | June 12, 2017

Disclaimer: I know that what I’m doing is completely unsupported right now. I’m documenting what I found in case someone else runs into the same issue.

Now that U-SQL supports running R and Python scripts, it would be awesome to develop and test U-SQL scripts with R locally.

Luckily, the folks at Microsoft received a lot of feedback asking when local execution would be available, so they put together a blog post describing how to do it.

Even though local execution is completely unsupported at the moment, this is something I need now to move some of our nightly work off of Hadoop.

Getting things running was pretty easy. Just follow some of the examples they have.

So there I was developing locally when I encountered an error: Insufficient disk space.

What? I pulled up trusty old WinDirStat and found a whole bunch of folders in the root of my C: drive (this is where my ADLA local root directory is). Each folder name was 8 random characters.

My first thought was, “Shit. I got a virus.” I develop in a virtual machine, so I wasn’t too worried.

When I looked inside, each of those folders appeared to contain an entire copy of an R installation.

And that’s when it hit me.  Every time I ran my script using the R Extensions locally, these folders were being created.

I deleted a few of them to free up some space, ran my script again and watched. Yep, a couple more of those directories showed up.

So, if you’re running this locally, just watch your hard drive space 🙂
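If you’d rather not eyeball the drive after every run, here is a rough sketch of how you could list those folders and see how much space each one takes. It only reports; it doesn’t delete anything. The name pattern (exactly 8 letters or digits) is my assumption based on what the folders looked like, and the function names are made up for this example, so adjust both to match what you actually see.

```python
import os
import re
from pathlib import Path

# Assumption: the leftover folders have names of exactly 8 letters/digits.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]{8}$")


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping any that can't be read."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or is locked; ignore it
    return total


def find_candidate_dirs(root):
    """Return (path, size_in_bytes) for matching direct subfolders of root, largest first."""
    found = [
        (child, dir_size_bytes(child))
        for child in Path(root).iterdir()
        if child.is_dir() and NAME_PATTERN.match(child.name)
    ]
    return sorted(found, key=lambda item: item[1], reverse=True)


def report(root):
    """Print one line per candidate folder with its size in megabytes."""
    for path, size in find_candidate_dirs(root):
        print(f"{size / 1024 ** 2:10.1f} MB  {path}")
```

Calling `report("C:/")` (or whatever your ADLA local root directory is) before and after a run should show whether new folders appeared. Any folder that happens to have an 8-character name will match too, so check what’s inside before deleting anything.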

    matt

      A little of both. I ran into this specific issue during the day, but I probably would’ve run into it at night as well 🙂


