Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 02:01
by linusa
spillerrec wrote: About the frame rate, I would say it really depends on the situation. In a FPS game, you can easily see the difference between 45 and 60 fps. (At least I can.) While in a racing game I often let the fps drop as low as 20 fps without caring much. 12 fps is way too slow for many applications though.
For a NXT display I would say 12 fps is just fine in most cases. It is better to have a stable slow frame rate than to have a fast choppy frame rate imo. (Btw, in NXT RPG I use 6.67 fps as the frame rate, even though the game could run more than twice as fast as that.)
About the FPS framerate: I originally thought "people who think they can distinguish between framerates above 40 fps are imagining things", but it's true. I've got my own theory why this might be the case: It's not only the display update rate that is faster with 60 fps (compared to let's say 45), it's also the sensor input update rate -- i.e. the whole game loop (at least on games of the Half-life 1 era) runs faster. And I believe this is what gamers sense: The faster reaction time, the smoother movement of the characters.
So when making games and having a limited display update performance, it might well be worth it to update user input more frequently in a separate thread. Ideally you would also calculate user and enemy positions more often, but this might again lead to performance problems.

What I mean is that not only the update rate, but also the latency is important: How long does it take for my user input to get onto the screen (as opposed to how often does it get onto the screen).

So, to get back to the NXC game: Maybe try separating "game logic", "user input" and "display thread", and don't let the interactive part get delayed by the slow drawing routines...

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 04:49
by muntoo
Could someone post all the .ric files? (Or even better - all the files!)

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 12:13
by spillerrec
I do not think it is the user input rate. I am a veteran Stepmania player (take a quick look at the link if you are not familiar with rhythm games). Half of the game is basically decoding the arrows which appear on the screen: which key should you press and when. The other half is pressing the key. The interesting aspect here is that as soon as you know when it needs to be pressed, there is no reason to look at the screen anymore.
So to get to the point, in this game I often do not look at all at the top row, which is the part that visually reacts to user input. Yet there is still a big difference between 30 and 60 fps (I can't confirm between 45 and 60 as the FPS counter normally is disabled, but I would say the likelihood is high).
I think that while we might not be able to properly see more than 12 fps, we might be able to sense it. Imagine a black ball flying past you at incredible speed: you might not see the ball, but you would most likely see a gray shadow of something which passed by.

But I fully agree with your opinion about latency. Sloppy input can really kill an otherwise good game. It was quite noticeable when programming the input in NXT RPG, where the slow frame rate easily caused issues. I changed the game so I could stop the main game loop until any input was detected. While input is handled by the main loop, input is detected in a separate thread so input will not be dropped while it is drawing or doing other tasks.
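
A minimal sketch of the "detect in one thread, handle in the main loop" idea, written in plain C rather than NXC so it can be compiled on a PC. The button names and the functions input_poll and input_take are made up for illustration and are not taken from NXT RPG. The point is only that presses are remembered until the game loop asks for them, so a slow frame cannot lose one.

```c
/* Hypothetical button bits; not taken from NXT RPG. */
enum { BTN_LEFT = 1 << 0, BTN_RIGHT = 1 << 1, BTN_ENTER = 1 << 2 };

/* Every button seen since the game loop last asked. */
static unsigned pending_input = 0;

/* Called frequently by the input task: remember presses, never clear them. */
void input_poll(unsigned buttons_down) {
    pending_input |= buttons_down;
}

/* Called once per frame by the game loop: take all latched presses and reset. */
unsigned input_take(void) {
    unsigned taken = pending_input;
    pending_input = 0;
    return taken;
}
```

On the NXT the two functions would run in different tasks, so the shared variable would presumably need protecting (NXC provides a mutex type with Acquire and Release for that); the sketch leaves this out.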


Well, I took a better look at the code this time. (Sorry Linusa, we can't profile it since we lack the level and graphic files, it wouldn't run properly without them.)
Honestly, nxtboy, I think you should try to take another look at your code to see if you can improve it yourself. Take a look at this part:

Code: Select all

{
	PROFILER_BEGINSECTION(9);
	E.eli=0;
	map[E.mex][E.mey].sld=0;
	switch(E.sc)
	{
	case 'L':
		map[E.mex+1][E.mey].sld=0;
		map[E.mex][E.mey].sld=0;
		E.sc='N';
		break;
	case 'R':
		map[E.mex+1][E.mey].sld=0;
		map[E.mex][E.mey].sld=0;
		E.sc='N';
		break;
	}
	switch(E.sc)
	{
	case 'D':
		map[E.mex][E.mey+1].sld=0;
		map[E.mex][E.mey].sld=0;
		E.sc='N';
		break;
	}
	jumped=1;
	PROFILER_ENDSECTION(9);
}
Notice you set map[x][y].sld to 0 before the switch structure. Then in every case you set map[x][y].sld to 0 again, which serves no purpose. You also have two switch structures which don't share any cases, which again is completely unneeded. Finally, case 'L' and 'R' are exactly the same, so these can be combined. Result:

Code: Select all

{
	PROFILER_BEGINSECTION(9);
	E.eli=0;
	map[E.mex][E.mey].sld=0;
	switch(E.sc)
	{
	case 'L':
	case 'R':
		map[E.mex+1][E.mey].sld=0;
		E.sc='N';
		break;
	case 'D':
		map[E.mex][E.mey+1].sld=0;
		E.sc='N';
		break;
	}
	jumped=1;
	PROFILER_ENDSECTION(9);
}
There is still a lot you should be able to improve yourself right now, please do this before asking others.

You have some if-then-else-if-then-else-if... structures which should be replaced with a switch:

Code: Select all

if(msa == 192){
	//something1
}
else if(msa == 96){
	//something2
}
else{
	//something3
}
should be:

Code: Select all

switch( msa ){
	case 192:	/*something1*/	break;
	case 96:	/*something2*/	break;
	default:	/*something3*/	
}
If structures are very inefficiently implemented in NXC (considering what they could be), so avoid them if you can.

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 12:51
by HaWe
spillerrec,
I'm curious if switch/break is really faster than if-else-if... because I normally also use "if" rather than "case" structures (I hate those breaks all over the place)
If it really is: how many percent actually?
I never thought that there might be a speed issue, but if there really is, I will have to rethink and reprogram my related source codes as well!

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 16:27
by spillerrec

Code: Select all

task main(){
	int x = 3, y;
	if( x == 1 )
		y = 1;
	else if( x == 2 )
		y = 2;
	else
		y = 3;
	
	NumOut(0,0,y);
}
becomes

Code: Select all

	mov __signed_stack_001main, __constVal3
	set __D0main, 1
	cmp 4, __zfmain, __signed_stack_001main, __D0main
	mov __D0main, __zfmain
	brtst 4, __NXC_Label_608, __zfmain
	set __main_7qG2_y_7qG2_000, 1
	jmp __NXC_Label_611
__NXC_Label_608:
	mov __signed_stack_001main, __main_7qG2_x_7qG2_000
	set __D0main, 2
	cmp 4, __zfmain, __signed_stack_001main, __D0main
	mov __D0main, __zfmain
	brtst 4, __NXC_Label_614, __zfmain
	set __main_7qG2_y_7qG2_000, 2
	jmp __NXC_Label_617
__NXC_Label_614:
	set __main_7qG2_y_7qG2_000, 3
__NXC_Label_617:
__NXC_Label_611:
( 19 lines )
There are a few opcodes which I thought optimization level 3 would have removed for the "if then else" method, but they are still there apparently...

Code: Select all

task main(){
	int x = 3, y;
	
	switch( x ){
		case 1: y = 1; break;
		case 2: y = 2; break;
		default: y = 3;
	}
	
	NumOut(0,0,y);
}
becomes

Code: Select all

	mov __D0main, __constVal3
	brcmp 4, __NXC_Label_609, __constVal1, __D0main
	brcmp 4, __NXC_Label_612, __constVal2, __D0main
	jmp __NXC_Label_615
	jmp __NXC_Label_608
__NXC_Label_609:
	set __main_7qG2_y_7qG2_000, 1
	jmp __NXC_Label_608
__NXC_Label_612:
	set __main_7qG2_y_7qG2_000, 2
	jmp __NXC_Label_608
__NXC_Label_615:
	set __main_7qG2_y_7qG2_000, 3
__NXC_Label_608:
( 14 lines )
Actually, the switch should have placed the default case directly under the brcmp opcodes, which removes a jmp opcode (and the extra one which doesn't really do anything shouldn't be there either of course):

Code: Select all

	mov __D0main, __constVal3
	brcmp 4, __NXC_Label_609, __constVal1, __D0main
	brcmp 4, __NXC_Label_612, __constVal2, __D0main
	set __main_7qG2_y_7qG2_000, 3
	jmp __NXC_Label_608
__NXC_Label_609:
	set __main_7qG2_y_7qG2_000, 1
	jmp __NXC_Label_608
__NXC_Label_612:
	set __main_7qG2_y_7qG2_000, 2
__NXC_Label_608:
( 11 lines )
Notice how a switch statement is built up. First comes a list of comparison statements, one for each case. If any of these succeeds, it jumps to the matching label. Then come the contents of the cases; each starts with a label followed by the code. If you end the case with a break, it adds the jmp opcode which jumps to the end of the structure.
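
The same layout can be sketched in plain C with gotos, mirroring the hand-optimised listing above: all comparisons first, the default body directly below them, then the labelled case bodies. This is only an illustration of the structure, not what the NXC compiler emits; the function names pick_switch and pick_lowered are made up.

```c
/* The switch as written in the example program. */
int pick_switch(int x) {
    int y;
    switch (x) {
        case 1:  y = 1; break;
        case 2:  y = 2; break;
        default: y = 3;
    }
    return y;
}

/* The same logic laid out like the 11-line listing. */
int pick_lowered(int x) {
    int y;
    if (x == 1) goto case_1;   /* first brcmp */
    if (x == 2) goto case_2;   /* second brcmp */
    y = 3;                     /* default body sits right under the compares */
    goto end;
case_1:
    y = 1;
    goto end;                  /* the jmp generated by "break" */
case_2:
    y = 2;                     /* last body needs no jmp, it just falls out */
end:
    return y;
}
```

Both functions return the same value for every input; the second just makes the jumps visible.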

It is not like it is a huge issue, but it is more efficient to use a switch.
And as Linusa previously advised with && and ||, you should order the cases depending on how likely they are. With a switch, place the most likely cases at the top, the less likely at the bottom.

Also:

Code: Select all

if( x == 1 )
   if( y == 2 )
      //code
is more efficient than

Code: Select all

if( x == 1 && y == 2 )
   //code
I really hope John will improve the compilation of if structures someday...

Disclaimer: The line counts are for reference; while it might give an indication for people new to NBC, it cannot be used as an efficiency meter.

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 16:52
by linusa
spillerrec wrote: Also:

Code: Select all

if( x == 1 )
   if( y == 2 )
      //code
is more efficient than

Code: Select all

if( x == 1 && y == 2 )
   //code
I really hope John will improve the compilation of if structures someday...
Oh, is it really more efficient? Because of jumps, or because of the "saved evaluation" of (y == 2)? I think that short-circuit evaluation is mandatory in C and C++, so you can write:

Code: Select all

if ( (x != 0) && (d > (y/x)) ) {
  // code
}
In this case, (y/x) with x == 0 never gets executed, so you don't get a "div by 0" error. So this should be 100% equivalent to

Code: Select all

if( x != 0 )
   if( d > (y/x) )
      //code
, no?
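
For what it's worth, the equivalence is easy to check in plain C, where short-circuit evaluation of && is guaranteed. Whether NXC behaves identically in every case is an assumption here, not something this snippet proves. The function names are made up for illustration.

```c
#include <stdbool.h>

/* Single condition: the division is only reached when x != 0. */
bool ratio_below_and(int x, int y, int d) {
    return (x != 0) && (d > (y / x));
}

/* Nested ifs: the inner test is only reached when the outer one passed. */
bool ratio_below_nested(int x, int y, int d) {
    if (x != 0)
        if (d > (y / x))
            return true;
    return false;
}
```

Calling either function with x == 0 returns false without ever dividing, and for any other x both give the same answer.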

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 18:10
by spillerrec
linusa wrote:Oh, is it really more efficient? Because of jumps, or because of the "saved evaluation" of (y == 2)?
Neither. The NXC compiler adds some bloat code when converting boolean expressions containing &&. The NXC compiler does a very poor job at converting boolean expressions into NBC in general, and when && is involved it becomes even worse. (Something similar with || iirc.)
So in short, using the expanded versions gets around some quirks in the NXC compiler...
(I didn't do a thorough test of it so I don't know the speed impact. The instruction count definitely fell with the expanded version though.)

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 20:04
by nxtboyiii
:oops: I'm sorry I did not post in a while. I will just post the entire program. I have updated the nxc files in the "source" folder so forget about the code I posted earlier and use this code. Just drag everything from the "bin" folder to the NXT and run the program.


Credits:

Game artists: Nintendo, Me, turiqwalrus, sidneys1, zippydee, yunhua98, ashbad, stryker001, my brother, minecraft game, YoManRuLz, and xedall2358

These are from omnimaga.org, mindboards, and mfgg.net.

EDIT:

E.mex is the x of it on the map
E.mey is the y of it on the map
E.fex is the float of the x of it on the map, since enemies move 1/8 of a tile at a time
E.fey is the float of the y of it on the map, since enemies move 1/8 of a tile at a time

EDIT2: Oops. I left the .nxc files in the bin folder. Do not put those on your NXT (duh). :) Same with the .sym file.

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 20:52
by HaWe
spillerrec wrote:
linusa wrote:Oh, is it really more efficient? Because of jumps, or because of the "saved evaluation" of (y == 2)?
Neither. The NXC compiler adds some bloat code when converting boolean expressions containing &&. The NXC compiler does a very poor job at converting boolean expressions into NBC in general, and when && is involved it becomes even worse. (Something similar with || iirc.)
So in short, using the expanded versions gets around some quirks in the NXC compiler...
(I didn't do a thorough test of it so I don't know the speed impact. The instruction count definitely fell with the expanded version though.)
What's the "expanded version"?
And which version is actually faster, and by how much? (Is it really worthwhile to pick one version over the other, or is it just theory?)

Re: Need help with efficiency NXC code

Posted: 15 Aug 2011, 23:26
by spillerrec
With the "expanded" version I mean the one which uses two if statements. The difference between the two is minimal though so personally I wouldn't bother. Much more can be gained by rewriting it in NBC anyway...
The speed difference depends on the contents of the structure. Consider these two cases:

Code: Select all

if( x == 2 && y == 3 )
   f = 42;

Code: Select all

if( x == 2 && y == 3 )
   f = calculate_pi();
In the first case most of the execution time would be spent evaluating x and y. If you had to do this for every item of an array, you would get a noticeable performance improvement. However, in the second case 99% of the time would be spent in calculate_pi(), and thus worrying about the efficiency of the if statement would be a waste of time.