MJay


Cloud Computing

What is overhead?

MJSon 2017. 5. 6. 18:02

I am a student in Computer Science and I am hearing the word "overhead" a lot when it comes to programs and sorts. What does this mean exactly?

asked May 18 '10 at 19:04 by yuudachi
how much "extra stuff" you need to do in order to get something. e.g. If I have to load up a 37 class project just to print "Hello World" I would consider that a lot of overhead. – scunliffe May 18 '10 at 19:05

Thanks everyone – yuudachi May 18 '10 at 19:20

< looks up... > – Pete Kirkham May 18 '10 at 19:41

@doug65536 Actually it's the other way around. =) – Fukuzawa Yukio Mar 18 '13 at 8:47
Accepted answer · 103 votes

It's the resources required to set up an operation. It might seem unrelated to the operation itself, but it's necessary.

It's like when you need to go somewhere, you might need a car. But, it would be a lot of overhead to get a car to drive down the street, so you might want to walk. However, the overhead would be worth it if you were going across the country.

In computer science, sometimes we use cars to go down the street because we don't have a better way, or it's not worth our time to "learn how to walk".

answered May 18 '10 at 19:06 by corsiKa
A similar analogy would be flying. Planes are much faster than cars, but the overhead of airport check-in, security, etc. makes cars a better option for shorter distances. – FogleBird May 18 '10 at 19:27

s/drive/go/ (If you're needing to drive somewhere, you don't usually decide to walk...) – RCIX May 19 '10 at 7:18

Thanks RCIX - good call. – corsiKa May 20 '10 at 20:27

it's not worth our time to "learn how to walk". - epic :D – inf3rno Apr 7 '14 at 0:52

@inf3rno The irony? How do we get to our car? We walk. And we can totally walk... to our car. We can not walk to our destination, even if it's closer than our car. – corsiKa Aug 28 '14 at 19:29

The meaning of the word can differ a lot with context. In general, it's resources (most often memory and CPU time) that are used, which do not contribute directly to the intended result, but are required by the technology or method that is being used. Examples:

  • Protocol overhead: Ethernet frames, IP packets and TCP segments all have headers, and TCP connections require handshake packets. Thus, you cannot use the entire bandwidth the hardware is capable of for your actual data. You can reduce the overhead by using larger packet sizes; UDP has a smaller header and no handshake.
  • Data structure memory overhead: A linked list requires at least one pointer for each element it contains. If the elements are the same size as a pointer, this means a 50% memory overhead, whereas an array can potentially have 0% overhead.
  • Method call overhead: A well-designed program is broken down into lots of short methods. But each method call requires setting up a stack frame, copying parameters and a return address. This represents CPU overhead compared to a program that does everything in a single monolithic function (see the timing sketch below). Of course, the added maintainability makes it very much worth it, but in some cases, excessive method calls can have a significant performance impact.
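
To make the method-call point concrete, here is a minimal Python sketch (my own illustration, not part of the original answer) that times a tiny computation written inline against the same computation hidden behind a function call, using the standard timeit module; the difference between the two timings is the per-call overhead.

    import timeit

    # Inline: the work is done directly in the timed statement.
    inline = timeit.timeit("x = 3 * 7 + 1", number=1_000_000)

    # Wrapped: the same work, but each iteration also pays for creating a
    # stack frame and passing arguments to the function.
    setup = "def compute(a, b):\n    return a * b + 1"
    wrapped = timeit.timeit("x = compute(3, 7)", setup=setup, number=1_000_000)

    print(f"inline:  {inline:.3f} s")
    print(f"wrapped: {wrapped:.3f} s")  # the gap is pure call overhead
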
answered May 18 '10 at 19:44 by Michael Borgwardt
    
Sounds like the word has the same meaning in all of those examples (necessary to execute the task, but not always related to doing it directly) – RCIX May 19 '10 at 7:20
    
Re data structure memory overhead: With most memory allocators, it's even worse than that. Each value returned by malloc has a built-in overhead of 8 bytes due to the allocator (assuming classic 32-bit machine) consisting of the size of the block plus guard values. And that's before you even think about allocation granularity. A singly-linked list of simple 4-byte integers will therefore have an overhead of 75%; arrays are much better (unless you need fast insertion in the middle) because they can have the overhead once (or less, if the array isn't dynamically allocated). – Donal Fellows May 20 '10 at 20:35
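
Along the same lines, here is a rough Python sketch (an illustration of the per-element idea only; it is not the C malloc accounting the comment describes) comparing the memory held by a hand-rolled singly-linked list of integers with a compact array of the same values:

    import sys
    from array import array

    class Node:
        __slots__ = ("value", "next")   # keep per-node cost as low as possible
        def __init__(self, value, next_node=None):
            self.value = value
            self.next = next_node

    values = list(range(10_000))

    # Linked list: one Node object per element.
    head = None
    for v in reversed(values):
        head = Node(v, head)

    node_bytes = 0
    cur = head
    while cur is not None:
        node_bytes += sys.getsizeof(cur)
        cur = cur.next

    # Compact array of machine-sized signed integers.
    arr = array("i", values)

    print("linked list:", node_bytes, "bytes")
    print("array      :", sys.getsizeof(arr), "bytes")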

Wikipedia has us covered:

In computer science, overhead is generally considered any combination of excess or indirect computation time, memory, bandwidth, or other resources that are required to attain a particular goal. It is a special case of engineering overhead.

answered May 18 '10 at 19:06 by Matthew Jones
But if it didn't, you would fix WikiPedia, and then make the same post here. – SamGoody Jan 31 '11 at 20:50

Overhead typically refers to the amount of extra resources (memory, processor, time, etc.) that different programming algorithms take.

For example, the overhead of inserting into a balanced binary tree could be much larger than the same insert into a simple linked list (the insert will take longer and use more processing power to balance the tree, which results in a longer perceived operation time for the user).
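
Python has no built-in balanced tree or linked list, so as a rough stand-in for the comparison above, this sketch (my own analogy, not the answerer's example) times cheap "just attach it at the end" inserts against inserts that must keep the container ordered, which is the kind of extra per-insert work a balanced tree also does:

    import bisect
    import random
    import timeit
    from collections import deque

    data = [random.random() for _ in range(20_000)]

    def plain_inserts():
        # Cheap insert: simply attach each element at the end.
        d = deque()
        for x in data:
            d.append(x)

    def ordered_inserts():
        # Extra work on every insert to keep the structure ordered.
        s = []
        for x in data:
            bisect.insort(s, x)

    print("plain  :", timeit.timeit(plain_inserts, number=5))
    print("ordered:", timeit.timeit(ordered_inserts, number=5))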

answered May 18 '10 at 19:06 by Justin Niessner

You're tired and can't do any more work, so you eat food. The energy spent looking for food, getting it and actually eating it is overhead!

Overhead is something wasted in order to accomplish a task. The goal is to make overhead as small as possible.

In computer science, let's say you want to print a number; that's your task. But storing the number, setting up the display to print it, calling the routines that print it, and then fetching the number from a variable are all overhead.

answered May 18 '10 at 19:44 by Laz
This is actually a very good definition! +1 – RCIX May 19 '10 at 7:20

For a programmer, overhead refers to those system resources which are consumed by your code when it's running on a given platform on a given set of input data. Usually the term is used in the context of comparing different implementations or possible implementations.

For example, we might say that a particular approach might incur considerable CPU overhead, while another might incur more memory overhead, and yet another might be weighted toward network overhead (and entail an external dependency, for example).

Let's give a specific example: Compute the average (arithmetic mean) of a set of numbers.

The obvious approach is to loop over the inputs, keeping a running total and a count. When the last number is encountered (signaled by "end of file" EOF, or some sentinel value, or some GUI button, whatever), we simply divide the total by the number of inputs and we're done.

This approach incurs almost no overhead in terms of CPU, memory or other resources. (It's a trivial task).
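
A minimal Python sketch of that first approach (my phrasing of it, not code from the answer):

    def running_mean(numbers):
        """Average the inputs in a single pass with O(1) extra memory."""
        total = 0.0
        count = 0
        for x in numbers:          # works on any iterable, including a stream
            total += x
            count += 1
        return total / count if count else 0.0

    print(running_mean(x * 0.5 for x in range(1, 11)))  # 2.75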

Another possible approach is to "slurp" the input into a list, iterate over the list to calculate the sum, then divide that by the number of valid items in the list.

By comparison this approach might incur arbitrary amounts of memory overhead.
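
The "slurp" version, for contrast; the memory held by the materialised list is the overhead (again, just a sketch):

    import sys

    def slurped_mean(source):
        values = list(source)               # materialise everything up front
        print("list takes", sys.getsizeof(values), "bytes")
        return sum(values) / len(values)

    print(slurped_mean(range(1_000_000)))   # 499999.5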

In a particularly bad implementation, we might perform the sum operation using recursion but without tail-call elimination. Now, in addition to the memory overhead for our list, we're also introducing stack overhead (which is a different sort of memory and is often a more limited resource than other forms of memory).
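
A sketch of that recursive variant: every pending call keeps a stack frame alive, which is the stack overhead (CPython does no tail-call elimination and also caps recursion depth, so the large input fails outright):

    def recursive_sum(values, i=0):
        # Each element adds another stack frame until the base case is reached.
        if i == len(values):
            return 0.0
        return values[i] + recursive_sum(values, i + 1)

    data = list(range(500))
    print(recursive_sum(data) / len(data))   # 249.5

    try:
        recursive_sum(list(range(100_000)))
    except RecursionError:
        print("stack overhead exhausted the recursion limit")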

Yet another (arguably more absurd) approach would be to post all of the inputs to some SQL table in an RDBMS and then simply call the SQL SUM function on that column of that table. This shifts our local memory overhead to some other server, and incurs network overhead and external dependencies on our execution. (Note that the remote server may or may not have any particular memory overhead associated with this task --- it might shove all the values immediately out to storage, for example).
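
A sketch of that variant using Python's built-in sqlite3 as a local stand-in for the remote RDBMS the answer describes; with a real server, the network round trips and the external dependency are the added overhead:

    import sqlite3

    conn = sqlite3.connect(":memory:")       # stand-in for a remote database
    conn.execute("CREATE TABLE samples (value REAL)")
    conn.executemany("INSERT INTO samples VALUES (?)",
                     [(float(x),) for x in range(1, 11)])

    # Let the database do the arithmetic: AVG(value) = SUM(value) / COUNT(*).
    (avg,) = conn.execute("SELECT AVG(value) FROM samples").fetchone()
    print(avg)                               # 5.5
    conn.close()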

Hypothetically, one might consider an implementation over some sort of cluster (possibly to make the averaging of trillions of values feasible). In this case, any necessary encoding and distribution of the values (mapping them out to the nodes) and the collection/collation of the results (reduction) would count as overhead.
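
A toy single-machine version of that idea using the multiprocessing module (a sketch only, not a real cluster): the per-chunk sums are the "map" step, combining them is the "reduce" step, and spawning the worker processes and shipping the data to them is the overhead.

    from multiprocessing import Pool

    def partial_sum_count(chunk):
        # "Map" step: each worker reduces its own slice of the input.
        return sum(chunk), len(chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        chunks = [data[i::4] for i in range(4)]    # distribute the values

        with Pool(4) as pool:
            partials = pool.map(partial_sum_count, chunks)

        total = sum(s for s, _ in partials)        # "Reduce" step
        count = sum(c for _, c in partials)
        print(total / count)                       # 499999.5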

We can also talk about the overhead incurred by factors beyond the programmer's own code. For example, compilation of some code for 32- or 64-bit processors might entail greater overhead than one would see for an old 8-bit or 16-bit architecture. This might involve larger memory overhead (alignment issues) or CPU overhead (where the CPU is forced to adjust bit ordering or use non-aligned instructions, etc.) or both.

Note that the disk space taken up by your code and its libraries, etc. is not usually referred to as "overhead" but rather is called "footprint." Also, the base memory your program consumes (without regard to any data set that it's processing) is called its "footprint" as well.

answered May 18 '10 at 19:22 by Jim Dennis

You could use a dictionary. The definition is the same. But to save you time: overhead is the work required to do the productive work. For instance, an algorithm runs and does useful work, but requires memory to do its work. This memory allocation takes time and is not directly related to the work being done, therefore it is overhead.

answered May 18 '10 at 19:06 by Yann Ramin

In computer science, overhead is generally considered any combination of excess or indirect computation time, memory, bandwidth, or other resources that are required to attain a particular goal. It is a special case of engineering overhead.

For example, if you have a program that runs in 10 minutes when given 1 CPU but takes 6 minutes when given 2 CPUs, there is 1 minute of synchronization/communication overhead on each CPU in the 2-CPU case, because with no overhead at all it should take only 5 minutes.
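
Spelled out with the example's own numbers (nothing new added):

    serial_time = 10.0            # minutes on 1 CPU
    parallel_time = 6.0           # minutes on 2 CPUs
    ideal_time = serial_time / 2  # perfect scaling would give 5 minutes

    overhead = parallel_time - ideal_time
    print(overhead)               # 1.0 minute of synchronization/communication overhead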

In essence, it's extra memory or CPU cycles consumed that may or may not be considered necessary to accomplish a specific computational task.

answered Nov 18 '16 at 5:55 by jessica

You can check Wikipedia, but mainly it's when more actions or resources are used. For example, if you are familiar with .NET, there you have value types and reference types. Reference types have memory overhead, as they require more memory than value types.

answered May 18 '10 at 19:09 by Incognito

A concrete example of overhead is the difference between a "local" procedure call and a "remote" procedure call.

For example, with classic RPC (and many other remote frameworks, like EJB), a function or method call looks the same to a coder whether it's a local, in-memory call or a distributed network call.

For example:

service.function(param1, param2);

Is that a normal method, or a remote method? From what you see here you can't tell.

But you can imagine that the difference in execution times between the two calls is dramatic.

So, while the core implementation will "cost the same", the "overhead" involved is quite different.
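
A self-contained Python sketch of that difference, using the standard library's XML-RPC modules as a stand-in for the remote framework the answer mentions (the port number is arbitrary): the call site for add(2, 3) looks identical in both cases, but the remote version also pays for serialization and an HTTP round trip on every call.

    import threading
    import time
    from xmlrpc.client import ServerProxy
    from xmlrpc.server import SimpleXMLRPCServer

    def add(a, b):
        return a + b

    # Run a tiny RPC server on localhost in a background thread.
    server = SimpleXMLRPCServer(("127.0.0.1", 8765), logRequests=False)
    server.register_function(add)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    remote = ServerProxy("http://127.0.0.1:8765")

    start = time.perf_counter()
    for _ in range(1_000):
        add(2, 3)                 # local, in-memory call
    local_per_call = (time.perf_counter() - start) / 1_000

    start = time.perf_counter()
    for _ in range(100):
        remote.add(2, 3)          # same-looking call, plus network overhead
    remote_per_call = (time.perf_counter() - start) / 100

    print(f"local : {local_per_call:.8f} s per call")
    print(f"remote: {remote_per_call:.8f} s per call")
    server.shutdown()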

answered May 18 '10 at 19:59 by Will Hartung

Think of the overhead as the time required to manage the threads and coordinate among them. It is a burden if the threads do not have enough work to do. In such a case, the overhead cost overcomes the time saved by using threading, and the code takes more time than the sequential version.
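
A small Python sketch of that trade-off (my illustration, not from the answer): the work per item is tiny, so creating, scheduling and joining the threads costs more than it saves, and the threaded version is no faster than the sequential one (CPython's GIL makes this especially pronounced for CPU-bound work).

    import threading
    import time

    data = list(range(200_000))

    def sequential():
        return sum(x * x for x in data)

    def threaded(num_threads=8):
        results = [0] * num_threads
        def worker(i):
            # Each thread squares and sums its own slice of the data.
            results[i] = sum(x * x for x in data[i::num_threads])
        threads = [threading.Thread(target=worker, args=(i,))
                   for i in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return sum(results)

    for fn in (sequential, threaded):
        start = time.perf_counter()
        fn()
        print(fn.__name__, f"{time.perf_counter() - start:.4f} s")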

answered Jan 31 '15 at 20:47 by Anas
