Streaming Youtube to MP3 Audio in NodeJS

Recently I learned how to stream a Youtube video's audio in MP3 format using nodejs. I wanted to publish my findings and show off how to accomplish this.

First, let's build this the non-streaming way, and then I will show you how to upgrade to a streaming method. You must install the powerful ffmpeg via your package management system of choice in order to play along.

I'll be using nodejs to do this because it offers us a very easy way to stream bits around. Despite the recent controversy, I'm still quite partial to coffee-script due to its terse syntax and clean output, so I'm going to use that for most of the new code. Let's also use the wonderful expressjs framework to get us started really quickly. This will set up our project skeleton.

express youtube_audio_streamer

cd youtube_audio_streamer && npm install

You can now convert the skeleton into coffee-script if you like, but I leave this as an exercise for the reader. For the sake of time I simply spawn another shell and do the following in it:

mkdir routes/coffee
cat > routes/coffee/index.coffee <<SCRIPT
exports.index = (req, res) ->
  res.render('index', { title: 'Express' })
SCRIPT
coffee -c -w -o routes routes/coffee/*

Then in yet another shell we can:

node app.js

And we're rocking the "hello world!" page.

Now, we have to add another dependency to the project. It turns out that getting a good, uncompressed FLV that ffmpeg can digest into something we can use is a giant pain. There are cookies that need setting, HTML that needs parsing, and language detection to accomplish. Luckily for us, misery loves company and the hard work of turning a normal youtube video url ( http://www.youtube.com/watch?v=:youtube_video_id ) into an ffmpeg-edible FLV can be done by youtube-dl. It requires python, but is otherwise dependency free. This means that we can focus on building something quickly. Assuming you've got a sane python accessible on your system, install it to the root of the project like so:

curl -s -o youtube-dl https://raw.github.com/rg3/youtube-dl/2011.12.18/youtube-dl && chmod +x youtube-dl

Let's use it from the command line and see how it works. I've plucked a random short video clip that's holiday themed for the demonstration. Remember, we just want the audio, and we want it in MP3.

./youtube-dl --extract-audio --audio-format mp3 http://www.youtube.com/watch\?v\=mayCvk2P4f0
[youtube] Setting language
[youtube] mayCvk2P4f0: Downloading video webpage
[youtube] mayCvk2P4f0: Downloading video info webpage
[youtube] mayCvk2P4f0: Extracting video information
[download] Destination: mayCvk2P4f0.flv
[download] 100.0% of 124.19k at  988.27k/s ETA 00:00
[ffmpeg] Destination: mayCvk2P4f0.mp3

Playing the resulting mayCvk2P4f0.mp3 results in Arnold's voice demanding that we release the cookie. Excellent! Let's make an API for this!

The only variable in all of this is the alphanumeric video id at the end of the youtube video URL. I'm going to call this the :youtube_video_id from now on. Since we'll be using a child process to launch youtube-dl, and we're going to need to write the resulting file back to the browser, we'll need the native nodejs modules 'child_process' and 'fs'. Knowing this, let's add the following code to routes/coffee/index.coffee

 

{spawn} = require 'child_process'
{readFile} = require 'fs'

exports.youtube_mp3 = (req, res) ->
  # Spawn a child process to obtain the FLV and use ffmpeg to convert it.
  youtube_dl = spawn './youtube-dl', ['--extract-audio', '--audio-format', 'mp3', "http://www.youtube.com/watch?v=#{req.params.youtube_video_id}"]
  # Let's echo the output of the child to see what's going on.
  youtube_dl.stdout.on 'data', (data) ->
    console.log data.toString()
  # In case something bad happens, we should write that out too.
  youtube_dl.stderr.on 'data', (data) ->
    process.stderr.write data
  # When we're done, let's send back the output.
  youtube_dl.on 'exit', ->
    readFile "./#{req.params.youtube_video_id}.mp3", (err, data) ->
      # We set our content type so consumers of our API know what they are getting.
      # 'audio/mpeg' is the registered MIME type for MP3 audio.
      res.send data, { 'Content-Type': 'audio/mpeg' }

 

 

Let's add this to our routes in app.js with this line:

app.get('/youtube_mp3/:youtube_video_id', routes.youtube_mp3);

Now restart our development server and go to http://localhost:3000/youtube_mp3/mayCvk2P4f0 to see the following in the log. It will probably differ slightly, and as you can see, the internet connection I'm on is terribly slow. If you have something in your browser capable of playing MP3s, you should hear audio.

 

Express server listening on port 3000 in development mode
[youtube] Setting language
[youtube] mayCvk2P4f0: Downloading video webpage
[youtube] mayCvk2P4f0: Downloading video info webpage
[youtube] mayCvk2P4f0: Extracting video information
[download] Destination: mayCvk2P4f0.flv
[download]   5.6% of 124.19k at   93.16k/s ETA 00:01
[download]  12.1% of 124.19k at  195.03k/s ETA 00:00
[download]  25.0% of 124.19k at  201.17k/s ETA 00:00
[download]  50.7% of 124.19k at  387.09k/s ETA 00:00
[download] 100.0% of 124.19k at  530.81k/s ETA 00:00
[ffmpeg] Destination: mayCvk2P4f0.mp3

 

 

So, what just happened? We take the id of a video and feed it to youtube-dl, which finds us an appropriate FLV file. This is then fed into ffmpeg, which writes a static mp3 file, which we then open and read. We should probably upgrade this to cleanly delete the mp3 when it's done, but it will be better to do this as streaming instead. Then not only will we have no intermediary file to worry about, it will also be a better user experience.

To see what I mean about a better user experience, instead of using a short video, let's try a long one. I'm going to demonstrate this with an 8:44 video of New Years 2011 in Times Square. Go to http://localhost:3000/youtube_mp3/GKpRXswgDwU. If you have a fast internet connection, this video may download fully and play before your browser times out, but I doubt you'll be impressed with the performance of our first attempted solution. I won't paste the log here, but you'll see the video is ~133 MB! Way too much data to download all at once and expect reasonable performance. Now, youtube-dl has some options for setting the max quality, and since it defaults to the highest available, we could probably do well to turn that down. In fact, if you do end up using this code in production, I would suggest you do that anyway, if only to lower the bandwidth bill.

In order to make this work with larger videos, we can't just download the whole video up front. We need to be able to simultaneously stream the video from youtube into ffmpeg, and the output of ffmpeg directly into the response.

Let's step away from the code and get something working in the shell first. To solve the first problem of streaming the FLV into ffmpeg, we can get ffmpeg to take input from stdin, and use a unix pipe to stream in the data output from GET'ing the URL of the FLV. We're going to need to get the URL that's used internally by youtube-dl. Lucky for us, there's a combination of options listed in `./youtube-dl --help` that outputs this for us. Here's the incantation to get the FLV URL from a youtube video URL without downloading any video:

./youtube-dl --simulate --get-url http://www.youtube.com/watch\?v\=:youtube_video_id

Now with the URL, we can write the new streaming code by first retrieving that URL using a youtube-dl child process. Then we can GET the FLV, and as we receive the file from youtube, we can pipe the data directly into an ffmpeg subprocess. With ffmpeg set to output the data on stdout, we can pipe this data directly to the response.

The two-subprocess streaming-to-response setup can be done using pure nodejs, but why do that when we have the excellent request library at our disposal? As you will see, this library makes the code a lot shorter and easier to read. Add this as a dependency to the project in the package.json file, and then `npm install` it. Once that's done, let's replace the whole file of routes/coffee/index.coffee with this:

 

{spawn, exec} = require 'child_process'
request = require 'request'

exports.index = (req, res) ->
  res.render('index', { title: 'Express' })

exports.youtube_mp3 = (req, res) ->
  video_id = req.params.youtube_video_id
  # The id gets interpolated into a shell command below, so refuse anything
  # that isn't a plain alphanumeric id (plus '-' and '_').
  unless /^[\w-]+$/.test video_id
    res.statusCode = 400
    return res.end 'Invalid video id'
  # Spawn a child process to obtain the URL to the FLV
  youtube_dl_url_child = exec "./youtube-dl --simulate --get-url http://www.youtube.com/watch?v=#{video_id}", (err, stdout, stderr) ->
    # Converting the buffer to a string is a little costly so let's do it upfront
    youtube_dl_url = stdout.toString()
    # there's a trailing '\n' returned from youtube-dl, let's cut it off
    youtube_dl_url = youtube_dl_url.trim()
    # Before we write the output, ensure that we're sending it back with the proper content type.
    # (Assigning to res.contentType would only overwrite the method and set no header.)
    res.setHeader 'Content-Type', 'audio/mpeg'
    # Create an ffmpeg process to feed the video to.
    ffmpeg_child = spawn "ffmpeg", ['-i', 'pipe:0', '-acodec', 'libmp3lame', '-f', 'mp3', '-']
    # Setting up the output pipe before we set up the input pipe ensures we don't lose any data.
    ffmpeg_child.stdout.pipe(res)
    # GET the FLV, pipe the response's body to our ffmpeg process.
    request({url: youtube_dl_url, headers: {'Youtubedl-no-compression': 'True'}}).pipe(ffmpeg_child.stdin)

 

There are a couple of bits of "magic" that need to be covered in the above code. Digging around in the source of youtube-dl, you'll see the request to get the FLV has the header 'Youtubedl-no-compression' set to 'True'. I mimic this behavior in order to ensure ffmpeg gets uncompressed FLV data from youtube, as ffmpeg does not support compressed FLVs. A possible upgrade later is to decipher the FLV compression and decompress this on the fly to ffmpeg, so we can use less bandwidth downloading the video. The other bit of magic is the command line args we're passing to ffmpeg. They are a combination of the same args used by youtube-dl to produce an mp3, as well as telling ffmpeg to take data in from stdin (the '-i' and 'pipe:0') and output to stdout (the '-'). We're also using LAME to re-encode the audio to mp3; you may need to adjust this to an encoder available on your system, or install LAME if it isn't installed.
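To make the ffmpeg argument list described above easier to tweak, here is a minimal JavaScript sketch that builds it from named options. The helper name `ffmpegMp3Args` and its option names are made up for this illustration; only the flags themselves come from the code above.

```javascript
// Hypothetical helper (not part of the original project): builds the ffmpeg
// argument list used above so the encoder can be swapped in one place.
function ffmpegMp3Args({ input = 'pipe:0', encoder = 'libmp3lame', output = '-' } = {}) {
  return [
    '-i', input,        // 'pipe:0' makes ffmpeg read the source from stdin
    '-acodec', encoder, // audio encoder; change this if LAME is unavailable
    '-f', 'mp3',        // name the output format, since '-' has no file extension
    output,             // '-' writes the encoded audio to stdout
  ];
}
```

With Node's child_process module this would be used as `spawn('ffmpeg', ffmpegMp3Args())`, passing a different `encoder` value if LAME is not the encoder you have installed.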

A few other comments: I dropped the dependency on 'fs' as we aren't reading a file anymore, and I use 'child_process.exec' for getting the URL, as the output is short and we fully depend on the completion of this process before moving on. I use 'child_process.spawn' for creating the ffmpeg child process because it allows access to the stdin and stdout streams. There are also lots of things that can go wrong that I'm not checking for, but this is a proof of concept code sample anyway, so use at your own risk.
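As a starting point for some of those missing checks, here is a minimal JavaScript sketch of failure handling around the ffmpeg child process. The function name `guardPipeline`, the 502 status, and the log messages are choices made for this illustration; it relies only on the standard 'error' and 'exit' events of child processes and the 'close' event of the response.

```javascript
// Hypothetical helper: attaches failure handling to an ffmpeg child process
// (as returned by child_process.spawn) and the HTTP response it is piped into.
function guardPipeline(child, res, log = console.error) {
  let failed = false;

  const fail = (reason) => {
    if (failed) return;          // report only the first failure
    failed = true;
    log(reason);
    // A status code can only be set while no headers have gone out yet.
    if (!res.headersSent) res.statusCode = 502;
    res.end();
    child.kill();                // harmless if the process has already exited
  };

  // Emitted when the process could not be spawned (e.g. ffmpeg is not installed).
  child.on('error', (err) => fail(`ffmpeg could not be started: ${err.message}`));

  // A non-zero exit code means ffmpeg reported a failure. A null code means
  // it was stopped by a signal, which is what our own kill() produces.
  child.on('exit', (code) => {
    if (code !== 0 && code !== null) fail(`ffmpeg exited with code ${code}`);
  });

  // Writing to a dead process raises an error on stdin; without a listener
  // that error would crash the server.
  child.stdin.on('error', (err) => fail(`ffmpeg stdin error: ${err.message}`));

  // Stop encoding when the client goes away.
  res.on('close', () => child.kill());
}
```

In the route above this would be called right after spawning ffmpeg, as `guardPipeline(ffmpeg_child, res)`. It does not cover a failed youtube-dl lookup or a failed download of the FLV, which would need their own checks.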

After restarting your development server, go to http://localhost:3000/youtube_mp3/GKpRXswgDwU. What once either timed out or sat for a long time no longer does! Provided everything went OK, we get the expected behavior of streaming the audio through the whole process.

A note about scaling: this setup would be expensive to scale due to the re-encoding, which is CPU bound, and with the code above you're limited to a single machine. To scale this further, I would use ZeroMQ. On each request for a youtube video to be encoded as audio, use a ZMQ_PUSH socket to give an FLV URL worker a :youtube_video_id to look up an FLV URL for. This worker could then ZMQ_PUSH to another set of workers that download the FLV and re-encode the audio to mp3, while publishing with ZMQ_PUB a response_id and a chunk of data to write to it. This setup will scale horizontally very well, since additional horsepower can be added very easily to the re-encoding layer. An alternate scaling plan could be to use HAProxy to round-robin the full HTTP requests to different machines running the code here, but this would need to be separate from the rest of the web application's code.

I hope that the exercise was as educational for you as it was for me. Some other things to think about: how do we enhance this to accept "Range" requests so we can use the streamed audio with jPlayer? We'll need to have better error handling to use this in production, so where are some good places to add error handling and what's worthwhile to validate? Is there any way we can decompress the FLVs on the fly and then stream this output to ffmpeg, to save bandwidth? What's a good caching strategy for looking up the URLs of the FLVs from youtube? Is it worthwhile to move the URL retrieval process into JS so we can do this in nodejs?
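On the caching question, one simple option is an in-memory cache with an expiry time. The JavaScript sketch below is only a possible starting point: the name `createUrlCache` is hypothetical, and how long an FLV URL from youtube stays valid is an unknown here, so the time-to-live is left as a parameter.

```javascript
// Hypothetical helper: remembers video id -> FLV URL lookups for ttlMs
// milliseconds. The clock is injectable so expiry can be tested.
function createUrlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    // Returns the cached URL, or undefined if missing or expired.
    get(id) {
      const hit = entries.get(id);
      if (!hit) return undefined;
      if (now() >= hit.expiresAt) {
        entries.delete(id); // drop stale entries as they are found
        return undefined;
      }
      return hit.url;
    },
    // Stores a freshly looked up URL along with its expiry time.
    set(id, url) {
      entries.set(id, { url, expiresAt: now() + ttlMs });
    },
  };
}
```

The route would check `cache.get(video_id)` before launching youtube-dl and call `cache.set(video_id, url)` after a successful lookup. A cache like this lives in one process only, so it would not be shared between machines.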

Caching in SilverLight 3

To sum it up, it looks like the touted GPU acceleration features of Silverlight 3 aren't as big as the marketing hype, but they are pretty useful. Looks like it effectively prevents re-renders of static elements, but still allows for transforms. I'll have to play around with this to see how it affects sprites in games.

http://pagebrooks.com/archive/2009/03/31/bitmap-caching-in-silverlight-3.aspx

Fixed Point Math in C#

Click here to download:
FInt.cs (13 KB)

Fixed point math is an interesting optimization for games, and it also has the feature of determinism, something that floating point implementations lack due to rounding, truncation, and hardware differences.

With determinism, a networked physics simulation can guarantee that every machine performs the same action and, given the exact same starting conditions, produces the same result. This reduces the network overhead significantly: instead of being forced to check that the same result was achieved across all participating computers, one only needs to make sure that each action was completed in the correct order.

Of course, cheating may force checks on the results anyway, but that's a different problem.
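To illustrate the idea, here is a minimal sketch of fixed point arithmetic, written in JavaScript with BigInt rather than C#. It is not the attached FInt.cs; the 12 fractional bits and all function names are assumptions chosen for the example. Every value is stored as an integer scaled by 4096, so each operation is plain integer math that gives the same result on any machine.

```javascript
// Fixed point sketch: a value x is stored as the integer x * 2^12.
const SHIFT = 12n;

const fromInt = (n) => BigInt(n) << SHIFT;
// Builds a value from a fraction such as 1/2 without going through floats.
// Integer division truncates toward zero, so 1/3 is rounded down.
const fromRatio = (num, den) => (BigInt(num) << SHIFT) / BigInt(den);

const add = (a, b) => a + b;
// The raw product carries 24 fractional bits, so shift 12 of them away.
const mul = (a, b) => (a * b) >> SHIFT;
// Pre-shift the dividend so the quotient keeps its 12 fractional bits.
const div = (a, b) => (a << SHIFT) / b;

// For display only; simulation state should stay in fixed point.
const toNumber = (a) => Number(a) / 4096;
```

For example, `toNumber(mul(fromInt(3), fromRatio(1, 2)))` gives 1.5. The price of the determinism is precision: with 12 fractional bits nothing smaller than 1/4096 can be represented.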

The attached file is based on information found here: http://stackoverflow.com/questions/605124/fixed-point-math-in-c

More fixed point functions can be found written in Java here: http://home.comcast.net/~ohommes/MathFP/

Part time game development in a vacuum

Andrew came over last night as well, and he, Gib, and I played some Smash and chatted. He asked, "So, how's the game coming?"

"It's on schedule. Tech demo out at the end of the month showcasing the terrain deformation."

"Ah, so is it playable?"

"Well, it's a tech demo. It's just showing off the terrain deformation I've been working on..."

"So, is it playable? With other people?"

A cold reminder about what other people are looking for in a game.

I'm a month in on core technology that will greatly determine the next steps of the game. Animation of the characters in the game requires that I have movement smoothed at least reasonably well so it will play nicely. Smooth movement requires extensive testing on the ground that the characters will be walking on, something that, until now, went undefined. Filled terrain, with textures, requires the destruction system working properly so I know what to draw, when, and where. I've allocated myself 2 months, double the amount of time it took to get this far, for networking code (integration engine physics are notoriously difficult to get networked because of latency) and getting the game online. At least a month on a full featured camera, with split screen functionality. I haven't even started projecting time for asset creation (graphics/animations, music, sounds), beautification (does the grass sway in the breeze? do bushes waggle as you walk past them? do you make a thud when you wail into something? footsteps? every detail that needs to be present for the game world to feel alive), and the list goes on.

What's the gameplay like? What are the actual mechanics? I can close my eyes and see it. Play it even. Swinging through a flat world of a detailed landscape, like Tarzan but you can place the vines anywhere with the grappling hooks. Building momentum, fine tuning the turns to locate your opponent, any advantage to get a jump on them. But they swing from behind, firing a volley of missiles that wear and tear on the cover, and grabbable ledges, that are the natural defenses of the place. Firing back and taking some damage, but not before wounding your foe. Explosions, smoke, chaos, a direct hit shakes the screen, disorienting. Using the remaining momentum from the fight, propelling away from the engagement, your opponent silently agreeing to end the encounter by doing the same, each of you to lick your wounds before the next encounter. Will they leave you enough time to recover? Did you wound them enough to gain the upper hand? Was their intent to weaken your position by reducing your hiding spots to rubble? Will you survive the next encounter? How do you gain the upper hand?

But what's being released at the end of the month is far from the bright vision behind my eyelids. It's a hollow graphic, just better than a stick figure, that shoots a small black rectangle with a circle on the end of it. When the rectangle hits the large hollow rectangles that the hollow graphic stands on, a circle is cut out from them.

It's a step. Progress. I tell myself, shaking my head at Andrew's shortsightedness. Rome wasn't built in a day. Or a month. I suppose it's a human condition to look at something that appears simple and silently think to one's self, "psssht, I can do that," despite the underlying complexities. Feats of dexterity are more protected from this sort of thing for the most part, but many intellectual and artistic endeavors suffer from this perception. People read a book and go, "It's words on a page, look, I can do that." They see lots of books out there already, and say, "See how easy it is? Everyone is writing books. It can't be that hard to do that." Movies, music, collage, everything. If there's a wide variety already out there, and the people that can do it aren't PhDs, then most people somehow get it into their heads that anyone can do it, and fairly quickly.

Or maybe I'm one of those people. Something I read in a comment on Slashdot about independent developers goes something like this: "The indie developer starts with an amazing game idea that redefines the genre. He spends 500 hours toiling at the engine, making a smooth, flexible, powerful design that can compete toe to toe with the million dollar engines for sale. His algorithms are perfect, and put industry experts to shame. He begins to put some basic models into the game, realizing that the publicly free and available models look terrible. He goes through every Blender tutorial in a month, and spends another 500 hours working on animations and models. When he realizes that the best his artistic talent produces is a sphere with a gray splotch that loosely resembles a bean bag, he has to slightly retool his original design. The game that could easily have been a million times better than Grand Theft Auto 5 is now Grand Theft Bean Bag. After spending another 500 hours on motion capture throwing a bean bag against a wooden plank, he releases his game, where it promptly falls off the face of the earth because it's ugly as sin and most people don't know what a bean bag is anymore, and those that do don't even like them. After the disappointment, he recovers enough to apply for a job at EA, where he spends a couple of years earning enough money to finally move out of his parents' basement, but quits after burning out spending 80 hours a week to write the next NFL2k game."

Is this my fate? Am I destined to toil endlessly in obscurity, before compromising on my vision due to my own limitations and then I get to watch it all fall to dust? Perhaps, perhaps not. It certainly goes to show that the engine is really nothing the player of the game is concerned about.

Regardless, I am really rethinking the idea of sending out the tech demo. On one side, it demonstrates progress. On the other, it sets up more preconceptions about my work and disappoints my audience. I am still going to build it, prepare it, make it. The learning I can't afford to miss, and the habit of a monthly build will give me a milestone to compare to next month. But it makes me ask the question: is it worth sending out to everyone if the value of it is only to me? The game is right now only truly loved by its maker. Perhaps it belongs with me alone until its face is pretty enough to be viewed by the general public.