Is a gaming machine better for software development? [closed]

Is a gaming machine better for software development?

NO.
CPU
For software development, you need lots of cores. For gaming, you need fast cores, but not necessarily many of them. This is slowly changing as newer games are written to take advantage of multicore CPUs, but in the general case most gaming machines focus on raw single-core speed. For example, in my case, I'm a Ruby on Rails (RoR) developer, and during development I run my editor, mongrel, solr, postgresql, and memcached. Most of the time I also have an open browser, a PDF editor, and iTunes.
RAM
Most games will be OK with 2-3GB of RAM.
For software development, especially web development - if you will be running multiple servers - you'll want at least 4GB, or even 8GB of RAM.
GPU
Top-of-the-line graphics cards for gaming can cost $500 or more. For software development, you can get away with the cheapest GPU you can get. The only aspect of the video card you'll want to concern yourself with is the capability to handle multiple large monitors.
It will actually be helpful if your development machine is so crippled (gaming-wise) that you can't play the games you like to play on that machine. No distractions! :)

I would say some aspects are the same between gaming machines and development machines, like large disks, a lot of memory, etc. So in that respect, yes, a gaming machine would fit better than a low-end desktop.
On the other hand, gaming machines tend to be tuned towards raw performance instead of robustness. A development machine often does not need a state-of-the-art graphics card, nor does it want RAID-0 to speed up the disk. If one disk fails you lose all your work, so RAID-1 would be much better. The same holds for memory: ECC (or whatever it's called nowadays) is a bit slower but adds robustness.
One gotcha with powerful development machines is that they do not represent the non-functional requirements of the execution environment. If you are not sufficiently aware of this, your software will run slowly on a "normal" machine because it ran great on your supercomputer :-) One take on this is that development machines should always be a tad slower than the target machines, but this cuts into your development time. A better solution is to have slower machines in the test environment and a few slower machines in the development lab.

Some attributes of gaming machines can help developers, like having a good deal of memory, or a quad core processor (so you can, respectively, run VMs without hassle, and compile faster).
But a fast GPU won't do you much good, so there's no point in spending much money on it. Unless you plan on developing or playing games, of course.
Summing up: if you plan on using the PC for fun, get a reasonable GPU. If you don't, skip it and keep the rest just like you would. You won't regret it.

If you want to develop games, sure. I should know; I have experience with both.

Unless you're programming something graphics- or game-related, not necessarily; the video card is going to be underused otherwise. On the other hand, gaming machines tend towards the high end, making them ideal for many programming tasks.

I think so. The performance required for gaming will greatly help developers. The only overkill is the graphics card, unless you use heavy rendering software, in which case plenty of RAM and strong graphics are a must.
Good CPU, Lots of fast RAM, and a fast HD will do you lots of good.

What you'll need for software development is usually a machine with ample RAM, ample HDD space (and a fast HDD or set of HDDs to boot), a fast multi-core processor (very important if you're working with compiled languages, especially the likes of C++ which take a long time to compile compared to Java or C#) and preferably the ability to drive multiple monitors. For the latter, it's a case of the more the merrier as screen real estate is one of those things that you can never have enough of.
While a lot of this does indeed sound like the spec for a gaming machine due to its raw number-crunching ability, the main difference is likely to be the graphics hardware. You don't need something that can render x million polygons per second on a single monitor if you're trying to drive 3x 24" monitors as 2D displays. In fact you probably don't want a typically rather noisy gamer-spec video card that only shines when rendering 3D; you're likely to get more out of a "pro" graphics card that can drive 4 monitors instead.
So yes, I'd think the spec is quite similar and there is a lot of overlap between the two but in the end a developer spec machine is not the same as a gaming rig.

A gaming machine without the fancy video card is, I think, more suitable for a programmer (you can spend the video-card money on more RAM, for example).

Gaming machines are great for everything except your wallet ;-)

Programming WPF Shader Effects is one of those particular tasks where a gaming machine can actually let you do more even when you're not working in game development. GPGPU work may also benefit from fast memory transfers and a fast GPU.
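As a rough illustration of the kind of data-parallel work GPGPU means, here is a minimal CUDA sketch (not taken from any particular project; the kernel, array size, and launch configuration are arbitrary placeholders). The memory transfers and the kernel launch are exactly where a faster GPU and faster bus bandwidth pay off.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Minimal GPGPU example: add two vectors on the GPU.
    __global__ void addVectors(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;                 // 1M elements (arbitrary size)
        size_t bytes = n * sizeof(float);

        float *a, *b, *c;
        cudaMallocManaged(&a, bytes);          // unified memory keeps the sketch short
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        addVectors<<<blocks, threads>>>(a, b, c, n);
        cudaDeviceSynchronize();

        printf("c[0] = %f\n", c[0]);           // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }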


How can I build a single-board computer, like the Raspberry Pi, to run an OS?

My question is: how can I build a single-board computer, like the Raspberry Pi, to run an OS? It would use an ARM microprocessor and a Debian ARM OS, and could use USB and so on, like the Raspberry Pi and other single-board computers.
I've searched but found nothing to help me! :(
The reason you can find nothing is probably because it is a specialist task undertaken by companies with appropriate resources in terms of expertise, equipment, tools and money.
High-end microprocessors capable of running an OS such as Linux use high-pin-density surface-mount packages such as BGA or TQFP; these (especially BGA) require specialist equipment to manufacture and cannot reliably or realistically be assembled by hand. The pin count and density necessitate multi-layer boards, which again require specialist manufacture.
What you would have to do if you wanted your own board is design it, source the components, and then have it manufactured by a contract electronics assembly house. Short runs and one-offs will cost you many times the price of just buying a COTS development or application board. It is only cost-effective if you are ultimately manufacturing a product that will sell in high volumes. It is only these volumes (and, until recently, Chinese manufacture) that make the RPi so inexpensive.
Even if you designed and had your own board built, that in itself requires specialist knowledge and skill. The bus speeds on such processors require very specific layout to maintain signal integrity and timing and to avoid EMC problems. The cost of suitable schematic capture and board layout software might also be prohibitive; no doubt there are some reasonably capable open-source tools, but you will have to find one that generates output your manufacturer can use to set up their machinery.
Some lower-end 8-bit microcontrollers with low pin counts are suitable for hand soldering or even DIP socketing, using a breadboard or prototyping board, but that is not what you are after.
[Further thoughts added 14 Sep 2012]
This is probably only worth doing if one or more of the following are true:
Your aim is to gain experience in board design, manufacture and bring-up as an academic or career development exercise and you have the necessary financial resources.
You envisage high production volumes where the economies of scale make it less expensive than a COTS board.
You have product requirements for specific features or form-factor not supported by COTS boards.
You have restricted product requirements where a custom board tailored to those, and having no redundant features, might in sufficient volumes be cost-effective.
Note that COTS boards come in two types: Application modules intended for integration in a larger system or product, and development boards that tend to have a wide range of peripherals, switches, indicators and connectivity options and often a prototyping area for your own use.
I know this is an old question, but I've been looking into the same thing, possibly for different reasons, and it now comes up at the top of a Google search offering more reasons not to ask (or even look into it) than it provides answers.
For an overview of what it takes to build a Linux-running board from scratch, this link is incredibly useful:
http://hforsten.com/making-embedded-linux-computer.html
It details:
The bare minimum you need in terms of hardware (ARM processor, NAND flash, etc.)
The complexities of getting a board designed
The process of programming the new chip on the board to include bootloaders and then pointing them to a Linux kernel for the chip to boot.
Whether the OP wishes to pursue every one of these challenges or just some of them, it is useful to know what the challenges are.
And these won't be all of them; adding displays, graphics, and other hardware and interfaces is not covered, but this is a start.
Single-board computers (SBCs) are expected to take more load than a typical hobby board, so they have a slightly more complicated structure in terms of PCB and components. You should be ready to work with BGA packages; almost all processors used in SBCs are BGA (no DIP/QFP). Here is the best blog post I recently came across: a very nicely designed and fabricated board running Linux on an ARM processor. The author has done a great job of designing as well as documenting the process. I hope it helps you understand both the hardware and software sides of SBCs.
A lot of answers are discouraging, but I would say you can do it, as I have already done it with the imx233. It's not easy, and it's not a weekend project. My project link is MyIMX233.
It took me about 4-5 months.
It didn't cost me much; a small fine-tip soldering iron is what I used.
The hard part is learning to design PCB.
Next task would be to find a PCB manufacturer with good enough precision, and prototyping price.
Next task would be to source components.
You may not get it right at first; I got the PCB right by my 3rd iteration. After that I was able to repeatedly produce 3 more boards, all of which worked fine.
PCB design - I used the open-source KiCad. You need to take care with impedance matching between the RAM and processor buses, and some other high-speed buses. I managed to do it on a 2-layer board with 5 mil/5 mil trace/space.
Component Sourcing - I got imx233 LQFP once via mouser, and once via element14.
RAM - 64 MB TSSOP.
Soldering - it's easy to mess up here, but the key is patience. One caution: don't use a frying pan and solder paste to do reflow soldering; I literally fried my first 2 processors that way. Even hot-air soldering by a mobile repair shop was not good enough.
Bootloading image - I didn't take much chance here; I just went with the Arch Linux image by Olimex.
If you want to skip the trouble of designing the circuitry between RAM and processor, skip the imx233 and go for the Allwinner V3S. In 2017/2018 this would be the easiest approach.
Bottom line is I am a software engineer by profession, and if I can do it, then you can do it.
Why not use an FPGA board?
Something with a Zynq, like the Zybo board, or from Altera, like the DE0-Nano SoCKit.
There you already have the ARM core, memory, etc., plus the possibility to add the logic you're missing.

Is it feasible to virtualize developer machines? [closed]

It's budgeting time, and Corporate is balking at the cost of replacing the machine of a coworker who is due for it, needs it, and deserves it.
Our group is a small ISV/SAAS that exists as a division of a larger media group. We are not a cost center, we make money, even this year. We are owned by a mid-size media group whose business model is quite different, and seems driven only by reducing costs.
Our software stack is Visual Studio 2008, SQL 2008, on Windows Server 2008 (so that multiple root websites can be hosted and debugged on each dev's machine). Our target hardware is 3GHz quad-core workstation, 4GB RAM, and RAID 1 mirrored hard drives so that we are protected against the productivity loss of losing a developer hard drive.
Corporate wants to give us a couple powerful, but hand-me-down, decommissioned servers, and then each developer would have a virtual workstation on that server. The computers sitting on our desktops would be dumb terminals at $400-500 each.
I'm trying to be neutral but I doubt it's hard to discern my bias. I'd like to see real developer reactions to this, and I figure this is the best place to get that.
Please include arguments for or against, evidence if you've seen this tried and how well (or not) it has gone.
This sounds like a well intentioned idea, but:
In my experience you need multiple cores, lots of memory, and fast disks to be productive in today's modern IDEs. I don't see that happening in a virtual environment with any economy. Individual boxes are still better.
It's also an issue of control. In a virtual environment I can imagine all kinds of restrictions. Will you still be able to install your own tools, for example?
Ultimately, it's misguided. If this idea increases build times by any substantial amount, any savings in hardware will quickly be erased by lost productivity. Conversely, money that is spent on decent individual machines for developers will quickly pay for itself over and over in reduced build times.
Good quality individual machines are an investment, not a cost.
Development is disk-bound, i.e. you spend your time waiting for builds, which are a disk-bound process most of the time. If you're all sharing a machine, build times will become much worse.
Aside from all of the givens (performance, disk space, etc.):
I would be OK with this as long as I still had multiple monitor support.
Without that, it is a no-go.
Basic failure to understand what a developer box is actually doing much of the time:
When building, it's chewing through processor and disk - especially disk.
When testing, you're talking about having one or more instances of Visual Studio running (once you get past two, things start to get interesting), a database server, websites/services, plus all the other stuff (browsers with a lot of tabs open, notebook software, and heaven only knows what else), all spread across multiple monitors (at least two). Lots of cores and lots of memory, please!
I can quite happily accept that there's an argument for virtualisation - a good dev box should be able to host multiple, concurrent VMs in order to isolate some of the above and to provide "clean" environments for testing. Note that that's the box for ONE developer hosting multiple VMs solely for the benefit of that one developer...
Our team has been developing on a remote server (no GUI stuff, plain old vim) for quite some time without problems. Granted, it requires a rather powerful server, and sometimes things start to get a bit slow if everyone starts compiling at the same time.
But as a bonus you are very mobile in terms of where you can develop from (we all have laptops), be it the office, home, or a sunny beach (the last one was probably an overstatement).
But yeah, that might not work well for graphics-heavy apps, of course.
It sounds like your group is not offering the solutions that you have considered in a well documented format, otherwise corporate would not be shoving decisions down your throat. If you have a documented process for development, corporate might want to discuss changing the process with you, but as soon as you say, "this change would break our process and we would have to retool our development workflow", they will see the pain of the $$ in reworking the process and most likely back off. That said, once your process is documented, you should internally be ruthless about trying to make it more efficient and cost effective, and have an open mind about corporate's suggestions.
I assume you have machines already for SVN / TRAC, your Continuous Integration server, product demos, testing, etc. and that the only possible use your team could make of these servers is for personal VMs.
I do many things that peg my processor at 100%. Compiles certainly achieve this. Now imagine having to share that processor with 10 other developers. The loss in productivity will become quite apparent. If you have a multi-core PC, this won't be as painful. Get an Intel i7 and you probably won't even notice it when 8 people are logged in. Most programs (including my compiler) can't use more than 1 processor anyway.
That said, it's a viable solution to reduce costs. I used to work at a company who has since switched to these dumb terminals. It works fine. My university had HP UNIX machines that were dumb terminals. They logged into a server that split up the processor ownership among however many people were logged in. What people would do is log into a server and check the number of people logged in. If there were too many, they'd search for the next one, because build times are noticeably slower. I'd never log into the easy to remember server names. =)
It definitely works, but also reduces productivity due to longer build times, especially when multiple people are building at the same time. Since productivity is such a difficult thing to quantify, it might be hard to argue your point.
Graphics acceleration might also be an issue if you need to do anything with animation, video, or image editing. You can't really test video playback through an RDP session since the framerate and/or color depth isn't high enough.
Regardless of performance, at my company we are moving to laptops as developer machines. The main advantage is that developers can bring their computers to meetings, conferences, etc. Also being able to sit next to a colleague when you're helping him with a problem, and having your own development environment available, is very valuable.

Hardware requirements for development machines

Given that:
SSDs are now [high-end] mainstream
Two+ cores are not hard to come across
24+ inch monitors are plentiful
Dual video outputs are the norm
64-bit OSes complement very cheap memory
Can I ask two questions of hardware-enthused developers [not the gamers!]?
What high-end hardware item could you not develop without - [what is your hardware crutch]?
What should a baseline [no frills] dev machine look like and what basic specs should it have to ensure that any dev can still be productive?
Note: it might be worth mentioning what platform and dev environment your baseline is for.
The most important hardware update (and most underrated) is the monitor.
If you're coding 8+ hours a day don't hesitate on costs and get a nice high end 24" at least, or even a pair of them.
An absolute must-have is a good monitor which is easy on the eyes; after all, you stare at it all day. I go with a 24" Samsung (I forget the model). I used to go with two monitors but prefer the one wide screen now. You need to be able to get docs and code on the same screen.
Second is a good chair and desk (sorry, not very technical).
Followed lastly by plenty of RAM (2 GB minimum). Once you get past any thrashing due to paging you are fine. Anything with a dual core has enough processing power.
This is entirely dependent upon what you are developing for. Take your target system requirements, double them, and use that as the minimum spec for the dev machines. That may seem odd, but it is about the minimum I've found I've needed when developing various projects.
As others have mentioned the importance of getting good monitors, keyboard, and chairs is underrated. If you are going to spend a lot of time at this PC, those are very important.
RAM is cheap, and you'll likely never have enough. If you are running 32-bit Windows, max it out at 4 GB of RAM. If you are using another OS that supports more than 4 GB of RAM (Linux, or 64-bit Windows, for example), start at 8 GB minimum, and if you are working on multimedia projects be ready to upgrade from there.
The best bang for the buck on CPUs seems to be quad cores right now, so I would say that at least a quad core (2.4 GHz or so) should be the minimum. You may not see much difference going up beyond there until you get to dual quad-core, which is a large price jump.
Find a reliable hard drive or two. Reliability and speed are going to be more important than size. Personally I currently go for a pair of 640GB western digital drives in all machines I build.
24 inch or larger monitor
Baseline dev machine would be a 15 inch MacBook Pro with 4GB of RAM. (For web development)
A pair of the fastest hard drives available. I never realized how much difference separate, fast system and data drives can make.
(And please, none of those slow SSDs that you usually get nowadays in <$2000 laptops - if you really want to hop on the SSD train, get a proper one; otherwise you might as well use a 32 GB SDHC card.)
There's been a study on the optimum size of computer monitors by the University of Utah
Wall Street Journal article. Not surprisingly, bigger monitors boost the speed of work. Surprisingly, there seems to be an optimum size of 26". There's no explanation why, though.
I am not a developer, but do sit at the computer all day.
For me the must-have is a desk that is a good height or easily adjusted. I prefer dual monitors: a 26" and a second widescreen that can turn sideways to view documents full length without a lot of scrolling. I want a computer with a dual-core (preferably quad-core) CPU and at least 4 GB of RAM (I tend to do a lot of VM work), and, as stated above, a good chair that has lumbar support and lets me lean back when I am reading or pondering a situation. The last one is specific to me: since I wear glasses and tend to hear high frequencies, I prefer incandescent lighting with a slightly warm spectrum. I can hear a fluorescent ballast over someone playing loud speakers. I also find I get less glare and can focus my eyes for longer periods with incandescent.
RAM, lots and lots of RAM. RAM compensates for many performance bottlenecks.
But do make sure you keep an eye on the memory usage of whatever you're building. When you're building a 60 MB footprint app on a system with 2 gigs of developer tools loaded at run-time, it's easy to lose that footprint in the noise, even when it doubles.
Don't bother shelling out for a high-end cpu. The cpu is the most overpowered component in modern systems. A standard cheap dual-core should be more than enough. Compiles tend to be disk-bound, not cpu bound, so that money is better invested in a faster drive.
Dell Outlet sells 30" LCD monitors for about $800.00.
That is a good place to start.
Besides that, invest time into tweaking your OS to your needs and automate as much as possible.
It's like I keep telling people, "I'll upgrade to the latest Mac when it somehow manages to help me run more Terminal windows and Text Editors." Until then, you're better off saving the money for a new machine and investing it into a decent monitor and keyboard.
It depends on the project.
For large imaging applications, like medical imaging, you may require large monitors (you have to view the images properly and in detail), powerful graphics, lots of RAM, and a good processor (imaging applications usually need lots of power).
I'm going to echo most people on the large monitors part, and you can always make good use of a pair.
Second to that is a good keyboard. What that means varies depending on which school of keyboard design you subscribe to; I'm with the ergonomic camp.
Following that is 2 GB+ of RAM and a recent desktop CPU (anything released in the past 2-3 years, really).
As has been previously said, large monitors are essential. These days it's not that expensive to have 2 hooked up to a machine. At work I'm lucky enough to have 3 hooked up to one PC, and it makes a huge difference to how I work.
A decent keyboard and mouse are essential. For the last 10 or so years I've always taken my own mouse and keyboard to work, as you typically end up with whatever comes from the PC manufacturer. I use a Microsoft ergonomic keyboard; it's very hard to find these in the workplace, or to get your employer to stump up for one, but I've never worked anywhere where the employer had an issue with me bringing my own in.
High-end hardware I cannot do without:
Kinesis contoured ergonomic keyboard ($300)
Fast twin SATA drives, striped for speed ($150)
Affordable luxuries I could do without:
Dell 30" widescreen monitor ($900)
Twin Velociraptor hard drives ($600)

Multi core programming

I want to get into multi core programming (not language specific) and wondered what hardware could be recommended for exploring this field.
My aim is to upgrade my existing desktop.
If at all possible, I would suggest getting a dual-socket machine, preferably with quad-core chips. You can certainly get a single-socket machine, but dual-socket would let you start seeing some of the effects of NUMA memory that are going to be exacerbated as the core counts get higher and higher.
Why do you care? There are two huge problems facing multi-core developers right now:
The programming model. Parallel programming is hard, and there is (currently) no getting around this. A quad-core system will let you start playing around with real concurrency and all of the popular paradigms (threads, UPC, MPI, OpenMP, etc.).
Memory. Whenever you start having multiple threads, there is going to be contention for resources, and the memory wall is growing larger and larger. A recent article at Ars Technica outlines some (very preliminary) research at Sandia that shows just how bad this might become if current trends continue. Multicore machines are going to have to keep everything fed, and this will require that people be intimately familiar with their memory system. Dual-socket adds NUMA to the mix (at least on AMD machines), which should get you started down this difficult road.
If you're interested in more info on performance inconsistencies with multi-socket machines, you might also check out this technical report on the subject.
Also, others have suggested getting a system with a CUDA-capable GPU, which I think is also a great way to get into multithreaded programming. It's lower level than the stuff I mentioned above, but throw one of those on your machine if you can. The new Portland Group compilers have provisional support for optimizing loops with CUDA, so you could play around with your GPU even if you don't want to learn CUDA yourself.
Quad-core, because it'll let you tackle problems where the number of concurrent processes is > 2, which often makes a problem non-trivial in new ways.
I would also, for sheer geek squee, pick up a nice NVidia card and use the CUDA API. If you have the bucks, there's a stand-alone CUDA workstation that plugs into your main computer via a cable and an expansion slot.
It depends what you want to do.
If you want to learn the basics of multithreaded programming, then you can do that on your existing single-core PC. (If you have 2 threads, then the OS will switch between them on a single-core PC. Then when you move to a dual-core PC they should automatically run in parallel on separate cores, for a 2x speedup). This has the advantage of being free! The disadvantages are that you won't see a speedup (in fact a parallel implementation is probably slightly slower due to overheads), and that buggy code has a slightly higher chance of working.
However, although you can learn multithreaded programming on a single-core box, a dual-core (or even HyperThreading) CPU would be a great help.
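For illustration, here is a minimal sketch of the "two threads" experiment described above (standard C++11 threads, compiled with a threading flag such as -pthread; the amount of work is arbitrary). On one core the OS time-slices the two threads, so the run takes about as long as doing the work twice in a row; on two or more cores the wall time roughly halves.

    #include <chrono>
    #include <cstdio>
    #include <thread>

    // Arbitrary CPU-bound work: spin through a counter so the thread stays busy.
    static void burn(unsigned long iters) {
        volatile unsigned long sink = 0;   // volatile keeps the loop from being optimized away
        for (unsigned long i = 0; i < iters; ++i) sink += i;
    }

    int main() {
        const unsigned long work = 500000000UL;   // arbitrary workload size

        auto start = std::chrono::steady_clock::now();
        std::thread t1(burn, work);
        std::thread t2(burn, work);
        t1.join();
        t2.join();
        auto end = std::chrono::steady_clock::now();

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
        std::printf("two threads took %lld ms\n", (long long)ms);
        return 0;
    }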
If you want to really stress-test the code you're writing, then as "blue tuxedo" says, you should go for as many cores as you can easily afford, and if possible get hyperthreading too.
If you want to learn about algorithms for running on graphics cards - which is a very different area to x86 multicore - then get CUDA and buy a normal nVidia graphics card that supports it.
I'd recommend at least a quad-core processor.
You could try tinkering with CUDA. It's free, not that hard to use and will run on any recent NVIDIA card.
Alternatively, you could get a PlayStation 3 and the Linux SDK and work out how to program a Cell processor. Note that the next cheapest option for Cell BE development is an order of magnitude more expensive than a PS3.
Finally, any modern motherboard that will take a Core Quad or quad-core Opteron (get a good one from Asus or some other reputable manufacturer) will let you experiment with a multi-core PC system for a reasonable sum of money.
The difficult thing with multithreaded/multicore programming is that it opens a whole new can of worms. The bugs you'll be faced with are usually not the ones you're used to. Race conditions can remain dormant for ages until they bite, and your mainstream language compiler won't assist you in any way. You'll get random data and/or crashes that only happen once a day/week/month/year, usually under the most mysterious conditions...
One thing fortunately remains true: the higher the concurrency exhibited by a computer, the more race conditions you'll unveil.
So if you're serious about multithreaded/multicore programming, then go for as many CPU cores as possible. Keep in mind that neither hyperthreading nor SMT allows for the level of concurrency that multiple cores provide.
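A minimal sketch of the kind of dormant race condition described above (standard C++ threads; the counts are arbitrary). With little real concurrency it can print the expected value for a long time before it ever fails, which is exactly why more cores help flush these bugs out:

    #include <cstdio>
    #include <thread>

    int counter = 0;                      // shared, unsynchronized: this is the bug

    void hammer() {
        for (int i = 0; i < 1000000; ++i)
            ++counter;                    // read-modify-write is not atomic
    }

    int main() {
        std::thread a(hammer);
        std::thread b(hammer);
        a.join();
        b.join();
        // With real concurrency the increments interleave and updates get lost,
        // so the printed value is usually below 2000000, yet on a lightly loaded
        // single core it can look correct for a long time.
        std::printf("counter = %d (expected 2000000)\n", counter);
        return 0;
    }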
I would agree that, depending on what you ultimately want to do, you can probably get by with just your current single-core system. Multi-core programming is basically multi-threaded programming, and you can certainly do that on a single-core chip.
When I was a student, one of our projects was to build a thread-safe implementation of the malloc library for C. Even on a single-core processor, that was more than enough to cure me of my desire to get into multi-threaded programming. I would try something small like that before you start thinking about spending lots of money.
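For illustration only, a crude sketch of the coarse-grained starting point such an exercise might use: serialize every allocation behind one global lock. The names here are made up, and std::malloc stands in for the hand-rolled heap a real assignment would implement before moving on to finer-grained locking.

    #include <cstdlib>
    #include <mutex>

    // Coarse-grained idea only: one global lock serializes every allocation.
    namespace toy {

    std::mutex heap_lock;

    void* malloc(std::size_t size) {
        std::lock_guard<std::mutex> guard(heap_lock);
        return std::malloc(size);   // placeholder for a hand-rolled allocator
    }

    void free(void* ptr) {
        std::lock_guard<std::mutex> guard(heap_lock);
        std::free(ptr);
    }

    } // namespace toy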
I agree with the others where I would upgrade to a quad-core processor. I am also a BIG FAN of ASUS Motherboards (the P5Q Pro is excellent for Core2Quad and Core2Duo processors)!
The draw for multi-core programming is that you have more resources to get things done faster. If you are serious about multi-core programming, then I would absolutely get a quad-core processor. I don't believe that you should get the new i7 architecture from Intel to take advantage of multi-core processing because anything written to take advantage of the Core2Duo or Core2Quad will just run better on the newer architecture.
If you are going to dabble in multi-core programming, then I would get a good Core2Duo processor. Remember, it's not just how many cores you have, but also how FAST the cores are at processing jobs. My Core2Duo running at 4 GHz routinely completes jobs faster than my Core2Quad running at 2.4 GHz, even with a multi-core program.
Let me know if this helps!
JFV

CUDA or FPGA for special purpose 3D graphics computations? [closed]

I am developing a product with heavy 3D graphics computations, to a large extent closest-point and range searches. Some hardware optimization would be useful. While I know little about this, my boss (who has no software experience) advocates FPGA (because it can be tailored), while our junior developer advocates GPGPU with CUDA, because it's cheap, hot, and open. While I feel I lack judgement on this question, I believe CUDA is the way to go, also because I am worried about flexibility; our product is still under strong development.
So, rephrasing the question, are there any reasons to go for FPGA at all? Or is there a third option?
I investigated the same question a while back. After chatting to people who have worked on FPGAs, this is what I get:
FPGAs are great for realtime systems, where even 1ms of delay might be too long. This does not apply in your case;
FPGAs can be very fast, especially for well-defined digital signal processing usages (e.g. radar data), but the good ones are much more expensive and specialised than even professional GPGPUs;
FPGAs are quite cumbersome to program. Since there is a hardware configuration component to compiling, it can take hours. It seems more suited to electronic engineers (who are generally the ones who work on FPGAs) than software developers.
If you can make CUDA work for you, it's probably the best option at the moment. It will certainly be more flexible than a FPGA.
Other options include Brook from ATI, but until something big happens, it is simply not as well adopted as CUDA. After that, there's still all the traditional HPC options (clusters of x86/PowerPC/Cell), but they are all quite expensive.
Hope that helps.
We did some comparisons between FPGA and CUDA. One thing where CUDA shines is if you can really formulate your problem in a SIMD fashion AND can access the memory in a coalesced way. If the memory accesses are not coalesced (1), or if you have different control flow in different threads, the GPU can lose its performance drastically and the FPGA can outperform it. Another case is when your operations are relatively small but you have a huge number of them, and you can't (e.g. due to synchronisation) launch them in a loop within one kernel; then your GPU kernel invocation times exceed the computation time.
Also, the power consumption of the FPGA can be better (it depends on your application scenario; i.e., the GPU is only cheaper in terms of Watts/FLOP when it's computing all the time).
Of course the FPGA also has some drawbacks: IO can be one (we had an application here where we needed 70 GB/s; no problem for a GPU, but getting this amount of data into an FPGA requires, for a conventional design, more pins than are available). Another drawback is time and money: an FPGA is much more expensive than the best GPU, and the development times are very high.
(1) Simultaneous memory accesses from different threads have to be to sequential addresses. This is sometimes really hard to achieve.
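A minimal CUDA sketch of footnote (1): the two kernels below differ only in their access pattern (the stride and sizes are arbitrary). In the first, neighbouring threads touch neighbouring addresses, so a warp's loads merge into a few wide memory transactions; in the second the addresses are scattered and effective bandwidth drops sharply, even though the arithmetic is identical.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Coalesced: thread i reads element i, so addresses within a warp are sequential.
    __global__ void copyCoalesced(const float* in, float* out, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = in[i];
    }

    // Non-coalesced: each thread reads a scattered, strided address.
    __global__ void copyStrided(const float* in, float* out, int n, int stride) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            int j = (i * stride) % n;   // non-sequential addresses
            out[j] = in[j];
        }
    }

    int main() {
        const int n = 1 << 22;          // arbitrary size
        float *in, *out;
        cudaMallocManaged(&in, n * sizeof(float));
        cudaMallocManaged(&out, n * sizeof(float));
        for (int i = 0; i < n; ++i) in[i] = 1.0f;

        int threads = 256, blocks = (n + threads - 1) / threads;
        copyCoalesced<<<blocks, threads>>>(in, out, n);    // fast: merged transactions
        copyStrided<<<blocks, threads>>>(in, out, n, 32);  // slow: scattered transactions
        cudaDeviceSynchronize();

        std::printf("out[0] = %f\n", out[0]);
        cudaFree(in); cudaFree(out);
        return 0;
    }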
I would go with CUDA.
I work in image processing and have been trying hardware add-ons for years. First we had the i860, then the Transputer, then DSPs, then FPGAs and direct compilation to hardware.
What inevitably happened was that by the time the hardware boards were really debugged and reliable and the code had been ported to them, regular CPUs had advanced to beat them, or the hosting machine architecture changed and we couldn't use the old boards, or the makers of the board went bust.
By sticking to something like CUDA you aren't tied to one small specialist maker of FPGA boards. The performance of GPUs is improving faster than CPUs and is funded by the gamers. It's a mainstream technology and so will probably merge with multi-core CPUs in the future, protecting your investment.
FPGAs
What you need:
Learn VHDL/Verilog (and trust me you don't want to)
Buy hw for testing, licences for synthesis tools
If you already have infrastructure and you need to develop only your core
Develop the design (and it can take years)
If you don't:
DMA, hw driver, ultra-expensive synthesis tools
tons of knowledge about buses, memory mapping, hw synthesis
build the hw, buy the IP cores
Develop the design
Not to mention board development.
For example, an average FPGA PCIe card with a Xilinx Zynq US+ chip costs more than $3000.
FPGA cloud is also costly, at $2/h+.
Result:
This is something that requires the resources of a running company, at least.
GPGPU (CUDA/OpenCL)
You already have hw to test on.
Compared to the FPGA stuff:
Everything is well documented.
Everything is cheap.
Everything works.
Everything is well integrated with programming languages.
There is a GPU cloud as well.
Result:
You just need to download the SDK and you can start.
This is an old thread started in 2008, but it would be good to recount what happened to FPGA programming since then:
1. C to gates in FPGA is the mainstream development path for many companies, with HUGE time savings vs. Verilog/SystemVerilog HDL. In C to gates, system-level design is the hard part.
2. OpenCL on FPGA has been around for 4+ years, including floating point and "cloud" deployment by Microsoft (Azure) and Amazon F1 (Ryft API). With OpenCL, system design is relatively easy because of the very well defined memory model and API between host and compute devices.
Software folks just need to learn a bit about FPGA architecture to be able to do things that are NOT EVEN POSSIBLE with GPUs and CPUs, because both are fixed silicon and do not have broadband (100 Gb+) interfaces to the outside world. Scaling down chip geometry is no longer possible, nor is extracting more heat from a single chip package without melting it, so this looks like the end of the road for single-package chips. My thesis here is that the future belongs to parallel programming of multi-chip systems, and FPGAs have a great chance to be ahead of the game. Check out http://isfpga.org/ if you have concerns about performance, etc.
FPGA-based solution is likely to be way more expensive than CUDA.
Obviously this is a complex question. The question might also include the cell processor.
And there is probably not a single answer which is correct for other related questions.
In my experience, any implementation done in an abstract fashion, i.e. a compiled high-level language vs. a machine-level implementation, will inevitably have a performance cost, especially in a complex algorithm implementation. This is true of both FPGAs and processors of any type. An FPGA designed specifically to implement a complex algorithm will perform better than an FPGA whose processing elements are generic, allowing it a degree of programmability from input control registers, data I/O, etc.
Another general example where an FPGA can deliver much higher performance is in cascaded processes, where one process's outputs become the inputs to another and they cannot be done concurrently. Cascading processes in an FPGA is simple, and can dramatically lower memory I/O requirements, while processor memory will be used to effectively cascade two or more processes where there are data dependencies.
The same can be said of a GPU and CPU. Algorithms implemented in C, executing on a CPU, developed without regard to the inherent performance characteristics of the cache or main memory system will not perform as well as ones developed with those in mind. Granted, not considering these performance characteristics simplifies implementation, but at a performance cost.
Having no direct experience with a GPU, but knowing its inherent memory-system performance issues, I expect it too will be subject to such issues.
CUDA has a fairly substantial code base of examples and a SDK, including a BLAS back-end. Try to find some examples similar to what you are doing, perhaps also looking at the GPU Gems series of books, to gauge how well CUDA will fit your applications. I'd say from a logistic point of view, CUDA is easier to work with and much, much cheaper than any professional FPGA development toolkit.
At one point I did look into CUDA for claim-reserve simulation modelling. There is quite a good series of lectures linked off the web site for learning. On Windows, you need to make sure CUDA is running on a card with no displays attached, as the graphics subsystem has a watchdog timer that will nuke any process running for more than 5 seconds. This does not occur on Linux.
Any machine with two PCI-e x16 slots should support this. I used an HP XW9300, which you can pick up off eBay quite cheaply. If you do, make sure it has two CPUs (not one dual-core CPU), as the PCI-e slots live on separate Hypertransport buses and you need two CPUs in the machine to have both buses active.
What are you deploying on? Who is your customer? Without even knowing the answers to these questions, I would not use an FPGA unless you are building a real-time system and have electrical/computer engineers on your team with knowledge of hardware description languages such as VHDL and Verilog. There's a lot to it, and it takes a different frame of mind than conventional programming.
I'm a CUDA developer with very little experience with FPGAs; however, I've been trying to find comparisons between the two.
What I've concluded so far:
The GPU has by far higher (accessible) peak performance
It has a more favorable FLOP/watt ratio.
It is cheaper
It is developing faster (quite soon you will literally have a "real" TFLOP available).
It is easier to program (based on an article I read, not personal opinion)
Note that I'm saying real/accessible to distinguish from the numbers you will see in a GPGPU commercial.
BUT the GPU is not more favorable when you need to do random accesses to data. This will hopefully change with the new NVIDIA Fermi architecture, which has an optional L1/L2 cache.
my 2 cents
Others have given good answers, just wanted to add a different perspective. Here is my survey paper published in ACM Computing Surveys 2015 (its permalink is here), which compares GPU with FPGA and CPU on energy efficiency metric. Most papers report: FPGA is more energy efficient than GPU, which, in turn, is more energy efficient than CPU. Since power budgets are fixed (depending on cooling capability), energy efficiency of FPGA means one can do more computations within same power budget with FPGA, and thus get better performance with FPGA than with GPU. Of course, also account for FPGA limitations, as mentioned by others.
FPGA will not be favoured by those with a software bias, as they need to learn an HDL or at least understand SystemC.
For those with a hardware bias FPGA will be the first option considered.
In reality a firm grasp of both is required & then an objective decision can be made.
OpenCL is designed to run on both FPGA & GPU; even CUDA can be ported to FPGA.
FPGA & GPU accelerators can be used together.
So it's not a case of one being better than the other. There is also the debate about CUDA vs OpenCL.
Again, unless you have optimized & benchmarked both for your specific application, you cannot know with 100% certainty.
Many will simply go with CUDA because of its commercial nature & resources. Others will go with OpenCL because of its versatility.
FPGAs are more parallel than GPUs, by three orders of magnitude. While a good GPU features thousands of cores, an FPGA may have millions of programmable gates.
While CUDA cores must do highly similar computations to be productive, FPGA cells are truly independent from each other.
FPGA can be very fast with some groups of tasks and are often used where a millisecond is already seen as a long duration.
A GPU core is way more powerful than an FPGA cell, and much easier to program. It is a full core: it can divide and multiply with no problem, whereas an FPGA cell is only capable of rather simple Boolean logic.
Because a GPU core is a core, it is efficient to program it in C++. Even though it is also possible to program an FPGA in C++, it is inefficient (just "productive"). Specialized languages like VHDL or Verilog must be used - they are difficult and challenging to master.
Most of the tried and true instincts of a software engineer are useless with FPGA. You want a for loop with these gates? Which galaxy are you from? You need to change into the mindset of an electronics engineer to understand this world.
At the latest GTC '13, many HPC people agreed that CUDA is here to stay. FPGAs are cumbersome; CUDA is getting more mature, supporting Python/C/C++/ARM. Either way, that was a dated question.
Programming a GPU in CUDA is definitely easier. If you don't have any experience with programming FPGAs in HDL it will almost surely be too much of a challenge for you, but you can still program them with OpenCL which is kinda similar to CUDA. However, it is harder to implement and probably a lot more expensive than programming GPUs.
Which one is Faster?
GPU runs faster, but FPGA can be more efficient.
The GPU has the potential of running at a speed higher than the FPGA can ever reach, but only for algorithms that are specially suited for it. If the algorithm is not optimal, the GPU will lose a lot of performance.
FPGA on the other hand runs much slower, but you can implement problem-specific hardware that will be very efficient and get stuff done in less time.
It's kinda like eating your soup with a fork very fast vs. eating it with a spoon more slowly.
Both devices base their performance on parallelization, but each in a slightly different way. If the algorithm can be granulated into a lot of pieces that execute the same operations (keyword: SIMD), the GPU will be faster. If the algorithm can be implemented as a long pipeline, the FPGA will be faster. Also, if you want to use floating point, FPGA will not be very happy with it :)
I have dedicated my whole master's thesis to this topic.
Algorithm Acceleration on FPGA with OpenCL