Announcing ARM support
Thanks to our Google Summer of Code student David Conrad (aka Yuvi), we now have ARM support in x264, along with a significant amount of SIMD acceleration via NEON, available on the Cortex A8 and A9 chips. Yes, that’s right, x264 can now run on an iPhone. Total performance increase from the NEON optimizations (so far) is about 280% on default settings.
With low power consumption becoming a priority and ARM chips increasing in speed dramatically (multi-core chips are already hitting silicon), high-quality, high-speed realtime video encoding on ARM chips will only grow in importance. Staying ahead of the game as always, x264 will be the premier encoder on ARM as well.
One situation showing the usefulness of low-power encoding was brought up a month or two ago: a remote-control airplane enthusiast wanted to make his airplane broadcast camera footage over the cell network so that he could control it remotely from many miles away. The cell network is generally low bandwidth, so he needs a high-efficiency video encoder. But he can’t afford a powerful system; his airplane is already extremely low power and he needs an encoder that is both low-power and low-weight. The ARM chip is perfect: it uses a fraction of a watt, almost no space, and now, he can run x264 on it.
Special thanks to Mans Rullgard for helping with lots of assembly questions and contributing the NEON deblocking code, originally used in the ffmpeg H.264 decoder.
Want to play with x264 on an ARM? Get a Beagleboard.
August 24th, 2009 at 11:25 pm
Another option for playing with ARM at home is SheevaPlug, which has Ethernet instead of USB/audio/video.
August 26th, 2009 at 1:54 am
^ I somehow doubt the SheevaPlug supports NEON, as it comes from Marvell (ex-Intel).
August 26th, 2009 at 11:55 am
While it appears the Marvell® 88F6281 SoC with Sheeva™, based on Kirkwood, does not have SIMD NEON capabilities,
there may be versions out there with it included by third parties, of course, now or in time, if you spend some time researching it.
This SheevaPlug does come with some impressive capabilities of its own that we AV x264 people might make use of,
not least the
Audio and MPEG Transport Stream Interface
http://www.marvell.com/files/products/embedded_processors/kirkwood/88F6281-004_ver1.pdf
The TI and ARM Cortex optimization speed improvements on page 11 of this seem impressive, and it's better today of course, as time has passed and clock speeds have increased since then, with third parties adding other video/audio IP to this core block as Lego-style plug-in SoC blocks.
http://www.arm.com/miscPDFs/23881.pdf
and you can find some of the developer info here
http://www.plugcomputer.org/index.php/us/component/search/SheevaPlug?ordering=&searchphrase=all
SheevaPlug Development Kit README-Rev1.2
x264 might be multithreading on the available cores, but it's a shame there's no way to multi-process some parts of an encode on these or other GigE devices/plugs as a generic option. One day perhaps, for fun, someone might try to find a good way.
September 1st, 2009 at 1:22 am
Support for iWMMXt would be pretty sweet.
September 1st, 2009 at 11:47 pm
@Chris
iWMMXt is basically deprecated at this point, so there’s no real point, especially since most ARM chips that support it are way too slow for video encoding anyways…
September 20th, 2009 at 11:40 am
Is there an ARMv5 version in the works?
September 20th, 2009 at 3:07 pm
@Chris
Probably not; ARMv5 is way too slow to do serious video encoding.
September 20th, 2009 at 3:41 pm
What would it take to port the current ARMv6 asm code to get it working on ARMv5? My app is low frame rate, so it’s OK if it doesn’t perform great.
September 23rd, 2009 at 1:09 am
armv5 has no simd, so there is no significant advantage to be gained from asm: the only advantage would be fixing compiler stupidities.
The existing v6/neon asm functions don’t really help in writing v5 asm; you’d have to rewrite most everything since the key simd instructions aren’t there.
October 19th, 2009 at 11:51 pm
Hi,
I was wondering if you could give me pointers on where/how to get started in compiling and using baseline x264 for a very slow ARM7TDMI. I know it is a bad MCU option for video. I am trying to do this purely for experimental and research purposes. I intend to play with CIF image sizes.
Thanks a lot for your time,
Ad.
October 20th, 2009 at 12:47 am
@Ad
If you’re looking for assistance, drop by IRC (#x264 or #x264dev on Freenode). Talk to Yuvi, he can likely help you with ARM-related issues.
October 26th, 2009 at 4:28 am
Hi!
I read about the performance boost and that 280% is enticing, but can we get some actual numbers? Like what frame rates can we expect on different frame sizes… I have no way right now to try it for myself, but I would really like to know this!
Thanks!
October 26th, 2009 at 11:06 am
@CocoBongo
You can get roughly 32fps on absolute fastest settings (constant QP, --preset ultrafast) with CIF resolution video on a 500MHz Cortex A8.
The A9 is a lot faster and the clock speed will rise as well, so VGA encoding at ~15fps isn’t out of the question in the near future.
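As a rough sanity check on the two figures above, the sketch below compares them in macroblocks per second. It assumes the standard frame sizes (CIF = 352x288, VGA = 640x480) and H.264's 16x16 macroblocks, and it assumes encoding cost scales linearly with macroblock count, which is only an approximation. The function names are made up for this illustration and are not part of x264.

```python
# Back-of-the-envelope comparison of encoding workloads.
# Assumption: cost is proportional to the number of 16x16 macroblocks
# processed per second (a simplification; real cost depends on settings).

MB_SIZE = 16  # H.264 macroblock width/height in pixels


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame (rounded up)."""
    mbs_wide = -(-width // MB_SIZE)   # ceiling division
    mbs_high = -(-height // MB_SIZE)
    return mbs_wide * mbs_high


def macroblocks_per_second(width, height, fps):
    """Macroblock throughput required for a given resolution and frame rate."""
    return macroblocks_per_frame(width, height) * fps


# CIF at ~32fps on a 500MHz Cortex A8 (figure quoted above)
cif_rate = macroblocks_per_second(352, 288, 32)
# VGA at ~15fps (the hoped-for near-future target)
vga_rate = macroblocks_per_second(640, 480, 15)

ratio = vga_rate / cif_rate
print(f"CIF @ 32fps: {cif_rate} MB/s")
print(f"VGA @ 15fps: {vga_rate} MB/s")
print(f"VGA target needs about {ratio:.2f}x the CIF throughput")
```

Under these assumptions the VGA target needs a bit under one and a half times the throughput already achieved at CIF, which is consistent with it being plausible on a faster A9.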
February 16th, 2010 at 1:18 am
http://www.dailywireless.org/2010/02/15/mwc-2010-really-big-show/
has some news on ARM Cortex-A9 MPCore 1.2GHz dual chips coming to a mobile phone soon.
ST-Ericsson’s U8500 platform
http://www.businesswire.com/portal/site/home/permalink/?ndmViewId=news_view&newsId=20100215005149&newsLang=en
Not too sure about the Mali-400™ graphics processor's capabilities as regards High profile, though!
Anyone got one to try and report its HP@L4.0 abilities, etc.?
February 16th, 2010 at 1:39 am
In other news, people might find a good use for this too while on the move.
http://www.theregister.co.uk/2010/02/15/wi_fi_sim/
A Wi-Fi SIM installed in your A9 cluster of mobile phones and PMPs, encoding an x264 job or two; just a clustered x264 patch or two away in the future, perhaps.
March 17th, 2010 at 9:57 pm
The 280% improvement from Neon is impressive. This is a relative performance number. Do you have any absolute performance numbers for given A8/A9-based hardware platforms?
October 1st, 2010 at 8:37 pm
Hmm, I wonder if RIM will provide some PlayBooks for free to x264 ARM devs before their retail release next quarter?
http://www.engadget.com/2010/09/27/rim-introduces-playbook-the-blackberry-tablet/
“a Cortex A9-based, dual-core 1GHz CPU (the company calls it the “fastest tablet ever”)
7-inch LCD, 1024 x 600, WSVGA, capacitive touch screen with full multi-touch and gesture support
BlackBerry Tablet OS with support for symmetric multiprocessing
1 GHz dual-core processor
1 GB RAM
Dual HD cameras (3 MP front facing, 5 MP rear facing), supports 1080p HD video recording
Video playback: 1080p HD Video, H.264, MPEG, DivX, WMV
Audio playback: MP3, AAC, WMA
HDMI video output
Wi-Fi – 802.11 a/b/g/n
Bluetooth 2.1 + EDR
Connectors: microHDMI, microUSB, charging contacts
Open, flexible application platform with support for WebKit/HTML-5, Adobe Flash Player 10.1, Adobe Mobile AIR, Adobe Reader, POSIX, OpenGL, Java
Ultra thin and portable:
Measures 5.1″x7.6″x0.4″ (130mm x 193mm x 10mm)
Weighs less than a pound (approximately 0.9 lb or 400g)
RIM intends to also offer 3G and 4G models in the future.”
Also, is there any benefit for the ARM x264 devs in checking for these types of memcpy speed improvements, or doesn't it affect the x264 ARM codebase? On the face of it, given the charts, he seems to get a lot of extra cycles back in testing.
http://projects.powerdeveloper.org/project/imx515/795
See the “While on the subject of memcpy..” entry
posted by martin krastev on 23rd September 2010.
March 7th, 2011 at 3:29 pm
Hi,
I am working on the power consumption of x264 on the BeagleBoard. I need information about Cortex-A8 utilization and Cortex-A8 NEON utilization to study power consumption.
Do you have any data about these two utilization figures when running x264?
Thanks
I would highly appreciate it if you could email me with your answer.