Bunny Mark Performance

Rayek Posts: 21 Member
edited July 2017 in General Chat

https://www.scirra.com/forum/performance-in-fairness_t191870

How is it possible that a web canvas-based engine is much faster than Godot in this benchmark? Is Godot's version somehow badly coded? Can it be optimized?

Comments

  • Rayek Posts: 21 Member

    The benchmarks are available here: https://github.com/colludium/bunnymark-tests

    Unity's performance leaves Godot in the dust, but even the JavaScript version is much faster. I feel something must be wrong here.

  • NeoD Posts: 130 Member
    edited May 2017

    I guess Godot's scene instancing is very slow.

    For a large number of objects it should be possible to draw the bunnies with a single Node2D, but rewriting the bunnymark this way is beyond my skills.

  • Shin-NiL Posts: 154 Moderator

    Converting the logic to a C++ module gives a small boost, but it is still not very performant :(

    https://github.com/Shin-NiL/Godot-BunnyMark-CPP

  • _807_ Posts: 57 Member

    The "script" bottleneck is not in Bunny.gd; it is in main.gd. It is not the .instance() call... If I have the time I will write my own version of bunnymark...

    But believe me on one thing: Construct is more performant than Godot in only one thing.

    And that thing is... NOTHING!

    (I suppose Unity should be more performant. They have more resources, and above all they have the "corporate structure", their "rented, closed, invasive software policy", "lots of workers", etc. It is really surprising that Godot is getting this close... We will see how much closer Godot 3 gets; for now we don't know. I would run the tests in Unity, but I don't install viruses on my computer.)

  • Rukiri Posts: 57 Member

    Not surprised Unity fares better. Its capabilities are really beyond Godot's, and that's because they have the $$$, manpower, and resources to do so. I thought about incorporating Vulkan into Godot, but maintaining it would be a hassle...

  • _807_ Posts: 57 Member

    OK, I ran the test.

    First of all: it's true that in Bunnymark the bottleneck is GDScript and not the renderer. The main reason is that every bunny has its own script... there are... a lot of scripts!!!

    My baseline: the test drops to 58 FPS at 3800 bunnies. (That is the mark the original project from https://github.com/colludium/bunnymark-tests reaches on my computer.)

    I made some changes to both scripts (main.gd and bunny.gd).

    Now the test drops to 58 FPS at 4400 bunnies.

    Both tests were run in the Godot editor preview. If somebody wants to test an exported build, the code is below.

  • _807_ Posts: 57 Member
    edited May 2017

    Modified main.gd code:

    extends Node2D
    var bunny = preload('res://bunny.scn')
    var bunnyCount = 0
    var elapsed = 0
    var NODE_fps
    var NODE_bunnyCount
    var fps
    var BunnySpawnIterator
    var BunnyInstance
    
    func _ready():
        # Initialization here
        NODE_fps = get_node("fps")
        NODE_bunnyCount = get_node("bunnyCount")
        set_process(true)
        addBunny(10)
    
    func _process(delta):
        if Input.is_action_pressed('mouse_down'):
            addBunny(10)
        elapsed = elapsed + delta
        # Update fps text once per second
        if elapsed > 1:
            fps = OS.get_frames_per_second()
            NODE_fps.set_text("FPS: " + str(fps))
            NODE_fps.set_as_toplevel(1)
            NODE_bunnyCount.set_as_toplevel(1)
            elapsed = 0
    
    func addBunny(n):
        BunnySpawnIterator = 0
        while BunnySpawnIterator < n:
            BunnySpawnIterator += 1
            BunnyInstance = bunny.instance()
            add_child(BunnyInstance)
            bunnyCount = bunnyCount + 1
        var count = "Bunnies: " + str(bunnyCount)
        NODE_bunnyCount.set_text(count)
    
  • _807_ Posts: 57 Member

    Modified bunny.gd code:

            extends Sprite
            var v = Vector2(randi() % 200 + 50,randi() % 200 + 50 )
            const ay = 980
            var pos
    
            func _ready():
                set_pos(Vector2(50,50))
                pos = get_pos()
                set_process(true)
    
            func _process(delta):
                v.y += ay * delta
                pos += v * delta  # v is already a Vector2
                if pos.y > 600:
                    v.y *= -0.85 
                    pos.y = 600
                    if randf() > 0.5:
                        v.y = -(rand_range(50,1150))
                elif pos.y < 0:
                    v.y = 0
                    pos.y = 0   
                elif pos.x > 800:
                    v.x = -v.x
                    pos.x = 800
                elif pos.x < 0:
                    v.x = abs(v.x)
                    pos.x = 0
                set_pos(pos)
    
  • _807_ Posts: 57 Member
    edited May 2017

    I think I could gain some extra FPS by simplifying the code, but I won't continue with this. Simply understand that if you want more than 4000 objects, each with its own script, you need to code with performance very much in mind. More than 4000 _process functions allocating memory every tick is something I consider "insane" (in GDScript, in C++, etc.).

    Some of the things I did in the script:

            1) eliminated the redundant pos = get_pos()
            2) changed the float x and y values to a Vector2
            3) changed the if/elif order (a bunny touches the floor more often than the walls)
            4) ... I don't remember more; compare the code.

    And I'm sure it can be done better (and I'm not a programmer, I'm a designer).

    If this "Bunnymark" is meant to measure performance, the code has to be written for performance. But I think this test is not representative of anything.

  • NeoD Posts: 130 Member

    807, it would be possible to go without this bunny script. The official demo called "Shower of bullets" is an example of a script that is better suited to a large number of objects.

    Here is the main comment of the bullet.gd file.

    # This demo is an example of controling a high number of 2D objects with logic and collision without using scene nodes.
    # This technique is a lot more efficient than using instancing and nodes, but requires more programming and is less visual
    
  • _807_ Posts: 57 Member
    edited May 2017

    Yes, but then you have to battle with Godot's internal servers, and the sweet part of Godot is the easy code and the fast development. If you need something like the logic of "Shower of bullets", maybe it is better to calmly rework your game design. How many bullet-hell games have 4000 "scripted" bullets on screen? Are there even enough pixels for the bullets? ;) (Note: the exported version of the reworked bunnymark handles 7000 bunnies at 60 FPS on my i5.)

    Really, if somebody is considering moving from one engine to another because of tests like "bunnymark", I would tell them to stop procrastinating and make that f***** game NOW.

  • _807_ Posts: 57 Member

    Bad news about "Shower of bullets". I ran the test with this method and the bunny script: there are no frame drops when instancing objects with .new(), but the performance is the same.

    A question: is there any way to undo the right-click "delete all" option? I used it thinking it would only delete one function, and the entire script went blank...

  • Shin-NiL Posts: 154 Moderator
    edited May 2017

    807 said:
    Bad news about "Shower of bullets". I ran the test with this method and the bunny script: there are no frame drops when instancing objects with .new(), but the performance is the same.

    Yeah, I still think the bottleneck is in _process. Even with the "Shower of bullets" approach you still iterate over each bunny and execute its logic.

    Maybe this little experiment might interest you.

  • _807_ Posts: 57 Member

    Yes... Before I started using Godot I did a lot of experiments like this, and I came away with a lot of "intuitive" conclusions: for loops are slow, local variables are slow, get_node is very slow. My scripts are "performant" now, but there is something in _process and _fixed_process that stops performing beyond certain limits. I think it may be related to frame time, vsync, or something like that.

    Example: the bunny script with the "Shower of bullets" approach runs at 60 FPS until it drops straight to 30 FPS. It makes sense that with a lot of scripts the engine can handle the "idle" time better (the original script degrades frame by frame), but the results are basically the same: at a certain number of nodes or a certain amount of arithmetic, the engine slows down, whether or not there is anything to render (maybe the servers?). I tried the bunny script without the bunnies and the performance is the same.

    Other things that are not performant are Light2D and the "Add" blend mode, but none of this is relevant to a lot of games. I used Construct for 2 years, and Godot's development speed is huge compared to that, and so is the renderer. You can run the games on old computers (not as well as engines with DirectX and compiled scripts like GM and jojo, or Unity, but with a lot more performance than Construct, Node webkit, and other HTML5-based engines like them). If you keep the limits in mind when you start developing there is nothing you cannot do (in 2D; 3D for me is "fog of war"), and it is a good habit to start by thinking about the basics of the game and to keep the code small from the beginning.

    P.S. If I redo the "Shower of bullets" bunny script I will post it here... Godot has a very, very irritating feature that I discovered today, the "delete all" option in the script editor. This feature is like a kick in the balls, like insulting someone's mother or something worse!!!!

  • _807_ Posts: 57 Member

    Here we are:

    "Shower of bunnys": same performance as the current bunnymark, but it doesn't stutter when instantiating bunnies:

    extends Node2D
    #"Shower of bunnys"
    #Godot 2.1.3
    #Mix between "Shower of bullets" demo and "bunnyMark" test for Godot.
    #Change main.gd in bunnymark with this script.
    
    var bunnyCount = 0
    var countConcatenate
    var elapsed = 0
    var fps
    var NODE_fps
    var NODE_bunnyCount
    var bunnys = []
    var mat #matrix32
    var bunnytexture = preload("res://sprite-sheet0.tex")
    
    class Bunny:
        var pos = Vector2()
        var body = RID()
        var v = Vector2(randi() % 200 + 50,randi() % 200 + 50 )
        const ay = 980
    
    func _ready():
        NODE_fps = get_node("fps")
        NODE_bunnyCount = get_node("bunnyCount")
        set_fixed_process(true)
        set_process(true)
        addBunny(10)
    
    func _fixed_process(delta):
        if Input.is_action_pressed('mouse_down'):
            addBunny(10)
        elapsed = elapsed + delta
        if elapsed > 1: # Update fps text once per second
            fps = OS.get_frames_per_second()
            NODE_fps.set_text("FPS: " + str(fps))
            NODE_fps.set_as_toplevel(1)
            NODE_bunnyCount.set_as_toplevel(1)
            elapsed = 0
    
    func _draw():
        for bunnyiterator in bunnys:
            draw_texture(bunnytexture, bunnyiterator.pos)
    
    func _process(delta):
        mat = Matrix32()
        for bunnyiterator in bunnys:
            bunnyiterator.v.y += bunnyiterator.ay * delta
            bunnyiterator.pos += bunnyiterator.v * delta
            if bunnyiterator.pos.y > 600:
                bunnyiterator.v.y *= -0.85 
                bunnyiterator.pos.y = 600
                if randf() > 0.5:
                    bunnyiterator.v.y = -(rand_range(50,1150))
            elif bunnyiterator.pos.y < 0:
                bunnyiterator.v.y = 0
                bunnyiterator.pos.y = 0 
            elif bunnyiterator.pos.x > 800:
                bunnyiterator.v.x = -bunnyiterator.v.x
                bunnyiterator.pos.x = 800
            elif bunnyiterator.pos.x < 0:
                bunnyiterator.v.x = abs(bunnyiterator.v.x)
                bunnyiterator.pos.x = 0
            mat.o = bunnyiterator.pos
            Physics2DServer.body_set_state(bunnyiterator.body, Physics2DServer.BODY_STATE_TRANSFORM, mat)
        update()
    
    func addBunny(number):
        for i in range(number):
            var newbunny = Bunny.new()
            newbunny.body = Physics2DServer.body_create(Physics2DServer.BODY_MODE_STATIC)
            Physics2DServer.body_set_space(newbunny.body, get_world_2d().get_space())
            newbunny.pos = Vector2(50,50)
            mat = Matrix32()
            mat.o = newbunny.pos
            Physics2DServer.body_set_state(newbunny.body, Physics2DServer.BODY_STATE_TRANSFORM, mat)
            bunnys.append(newbunny)
            bunnyCount = bunnyCount + 1
        countConcatenate = "Bunnies: " + str(bunnyCount)
        NODE_bunnyCount.set_text(countConcatenate)
    
  • _807_ Posts: 57 Member

    This test amuses me... Would it be possible to do the _process calculation in a shader? Maybe that would improve performance.

  • Shin-NiL Posts: 154 Moderator

    807, I've just discovered something interesting. Getters, setters, and property access seem to be a little slow. Using my original code I got:

    Changing it to avoid using Vector2 properties:

    extends Sprite
    
    var velocity_x = 0
    var velocity_y = 0
    var gravity = 3
    var max_x = 640
    var min_x = 0
    var max_y = 480
    var min_y = 0
    
    
    func _ready():
        var tex = preload("res://wabbit_alpha.png")
        set_texture(tex)
        velocity_x= randf() * 10
        velocity_y = rand_range(5, 10)
        set_process(true)
    
    func _process(delta):
        var pos = get_pos()
        var pos_x = pos.x
        var pos_y = pos.y
    
        pos_x += velocity_x
        pos_y += velocity_y
        velocity_y += gravity
    
        if (pos_x > max_x):
            velocity_x*= -1
            pos_x = max_x
        elif (pos_x < min_x):
            velocity_x*= -1
            pos_x = min_x
    
        if (pos_y > max_y):
            velocity_y *= -0.8
            pos_y = max_y
            if (randf() > 0.5):
                velocity_y -= randf() * 12

        elif (pos_y < min_y):
            velocity_y = 0
            pos_y = min_y
    
        set_pos(Vector2(pos_x, pos_y))
    

    I got:

  • _807_ Posts: 57 Member
    edited July 2017

    Shin_NiL:

    Great discovery!!! You are right, bunnymark solved... It is not the math, not the scripting language, not the renderer... it is the set/get calls, and specifically the "set" function. If I comment out the get_pos line, performance is the same (slightly better), but if I comment out the set_pos line, performance goes up by a factor of 1.5 (edited, more tests done). So the problem is not global vs. local variables, arithmetic (edited: not sure, more tests done!), an excessive number of nodes, an excessive number of scripts with processing turned on, the if/elif order, or rendering the bunnies. The bottleneck is set_pos. Great discovery!

    This test was driving me crazy, because before changing engines I tested GDScript against Construct 2 events, and Godot's results for pure loops and arithmetic beat Construct's JavaScript by far, so I could not understand this.
    But my tests only did things like "while blablabla: var += 1", etc. They didn't use getters and setters :innocent: :neutral:
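
The method used in the post above (disable one call, measure again) can be tried outside the engine. This is only a minimal Python sketch, not GDScript and not Godot's API: `FakeNode`, its `set_pos` method, `step`, and `time_loop` are hypothetical names, and the bookkeeping inside `set_pos` merely stands in for whatever work an engine does when a position changes.

```python
import time


class FakeNode:
    """Hypothetical stand-in for an engine node; this is not Godot's Node2D."""

    def __init__(self):
        self.pos = (0.0, 0.0)
        self.dirty_count = 0

    def set_pos(self, pos):
        # Stand-in for engine-side work, e.g. marking a transform as dirty.
        self.pos = pos
        self.dirty_count += 1


def step(x, y, vx, vy, delta):
    """One integration step of bunny-style motion, with no node calls."""
    return x + vx * delta, y + vy * delta


def time_loop(iterations, call_setter):
    """Run the same arithmetic with or without the setter call.

    Returns (elapsed_seconds, node) so the caller can inspect the node.
    """
    node = FakeNode()
    x, y = 0.0, 0.0
    start = time.perf_counter()
    for _ in range(iterations):
        x, y = step(x, y, 10.0, 5.0, 0.016)
        if call_setter:
            node.set_pos((x, y))
    return time.perf_counter() - start, node


if __name__ == "__main__":
    with_set, _ = time_loop(200000, True)
    without_set, _ = time_loop(200000, False)
    print("with set_pos:    %.3f s" % with_set)
    print("without set_pos: %.3f s" % without_set)
```

The absolute numbers say nothing about Godot; the point is only that the two runs differ by exactly one call, which is what makes the comparison meaningful.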

  • _807_ Posts: 57 Member
    edited July 2017

    Shin_NiL:

    Just out of curiosity I ran your speed test in Love2d... and the results are depressing... Here is the code:

        counter = 0
        tiempo_inicio = love.timer.getTime()
        loop1 = 0
        loop2 = 0
        loop3 = 0
        -- 10 x 3200 x 3200 nested iterations
        for i = 1, 10 do
            loop1 = loop1 + 1
            for j = 1, 3200 do
                loop2 = loop2 + 1
                for k = 1, 3200 do
                    loop3 = loop3 + 1
                    counter = counter + 1
                    if counter > 50 then
                        counter = 0
                    end
                end
            end
        end
        tiempo_fin = love.timer.getTime()
        tiempo_total = tiempo_fin - tiempo_inicio

        -- Show the counters and the elapsed time on screen
        function love.draw()
            love.graphics.print(tiempo_inicio, 400, 250)
            love.graphics.print(loop1, 400, 275)
            love.graphics.print(loop2, 400, 300)
            love.graphics.print(loop3, 400, 325)
            love.graphics.print(tiempo_fin, 400, 350)
            love.graphics.print("Total time in seconds", 400, 375)
            love.graphics.print(tiempo_total, 400, 450)
        end
    

    The results:
    Godot needs 35 seconds in the editor and 27 seconds in a release export without debug mode, whether I use a while loop or a for loop.
    Love2d needs 0.16 seconds for the same operation.
    Obviously there's something wrong with that. Is interpreted Lua 150 times faster than GDScript/Python?

  • Shin-NiL Posts: 154 Moderator

    This was to be expected, since Lua is one of the fastest scripting languages we have out there. I agree there is room for improvement in GDScript. At least GDScript is on par with Python in this case :wink:
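
For anyone who wants to check the Python side of that comparison on their own machine, here is a minimal Python port of the nested-loop test. The function name `nested_loop_test` and its parameters are my own choices, and no timing is claimed here; the demo run uses a reduced size so it finishes quickly.

```python
import time


def nested_loop_test(outer=10, middle=3200, inner=3200):
    """Python port of the Love2d loop test.

    Counts iterations at each nesting level and returns
    (loop1, loop2, loop3, elapsed_seconds).
    """
    counter = 0
    loop1 = loop2 = loop3 = 0
    start = time.perf_counter()
    for _ in range(outer):
        loop1 += 1
        for _ in range(middle):
            loop2 += 1
            for _ in range(inner):
                loop3 += 1
                counter += 1
                if counter > 50:
                    counter = 0
    elapsed = time.perf_counter() - start
    return loop1, loop2, loop3, elapsed


if __name__ == "__main__":
    # Reduced size for a quick check. Call nested_loop_test() with no
    # arguments for the 10 x 3200 x 3200 size used in the thread
    # (102,400,000 innermost iterations, which can take a while).
    print(nested_loop_test(2, 100, 100))
```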

  • _807_ Posts: 57 Member

    Another one:

    Godot 3.0, fixnum Windows build of 13/06/2017.
    VisualScript, 10x3200x3200 iterations test (editor preview, "for loop" approach):
    Result: 2 min 9 sec.
    VisualScript loses to GDScript: 3.5 times slower in the 10x3200x3200 test.

    I don't know C or C++, but it would be cool to run the test in GDNative and/or in a module, to know exactly how much the basic loop computation speeds up with each.

    I can't paste the code... it's rectangles :D

  • Shin-NiL Posts: 154 Moderator

    I opened an issue about our discussion.

  • _807_ Posts: 57 Member

    More tests of GDScript speed and API access speed, done in the editor preview. Sorry about the code repetition, but this is a performance test and function calls (as the test shows) have a bit of overhead.

    The test:

    extends Node2D
    
    const numberofiterations = 1000000
    
    var InitTimer
    var EndTimer
    const BasicNumber1 = 1
    const BasicNumber2 = 2
    var VarNumber1 = 1
    var VarNumber2 = 2
    var FloatNumber1 = 1.06542315654
    var FloatNumber2 = 2.35684456213
    const VectorNumber = Vector2(2,2)
    enum {NUMBER1 = 1, NUMBER2 = 2}
    var result = 0
    
    func _ready():
        print ("___________________________________________________________")
        print ("GODOT 2.1.3. Velocity GDSCript and API function access test")
        print ("___________________________________________________________")
        print ("0- CONSTANTS INT " +str(numberofiterations)+" x2 SUM iteration")
    
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result += BasicNumber1
            result += BasicNumber2
        EndPerformanceTest()
    
        print ("\n1-INT VAR " +str(numberofiterations)+" x2 SUM iteration")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result += VarNumber1
            result += VarNumber2
        EndPerformanceTest()
    
        print ("\n2-FLOAT VAR " +str(numberofiterations)+" x2 SUM iteration")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result += FloatNumber1
            result += FloatNumber2
        EndPerformanceTest()
    
        print ("\n3-ENUM CONST " +str(numberofiterations)+" x2 SUM iteration")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result += NUMBER1
            result += NUMBER2
        EndPerformanceTest()
    
        print ("\n4-VECTOR2 sum VEC2CONST " +str(numberofiterations)+" x1 SUM iteration")
        result = Vector2()
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result += VectorNumber
        EndPerformanceTest()
    
        print ("\n5-VECTOR2 sum VEC2CONST to members " +str(numberofiterations)+" independent XY SUM iteration")
        result = Vector2()
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result.x += VectorNumber.x
            result.y += VectorNumber.y
        EndPerformanceTest()
    
        print("\n6-FUNCTIONCALL SUM IN SCRIPT GLOBAL VAR " +str(numberofiterations)+" x2 ITERATIONS")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            SUM2Numbers(VarNumber1,VarNumber2)
        EndPerformanceTest()
    
        print("\n7-SET_POS NO CHANGES OPERATION " +str(numberofiterations)+" ITERATIONS")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            set_pos(Vector2(0,0))
        EndPerformanceTest()
    
        print("\n8-GET_POS NO CHANGES OPERATION " +str(numberofiterations)+" ITERATIONS")
        print("")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            var pos = get_pos()
        EndPerformanceTest()
    
        print("\n9-SET_POS Node2D X+CONST Y+CONST " +str(numberofiterations)+" ITERATIONS")
        result = 0
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            set_pos(get_pos()+Vector2(BasicNumber1,BasicNumber2))
        EndPerformanceTest()
    
        print("\n10-SET_POS Node2D X+CONST Y+CONST Aproach1 " +str(numberofiterations)+" ITERATIONS")
        result = 0
        InitPerformanceTest()
        var TemporalX = 0
        var TemporalY = 0
        var Temporal = Vector2()
        for i in range (1,numberofiterations):
            TemporalX += BasicNumber1
            TemporalY += BasicNumber2
            Temporal = Vector2(TemporalX,TemporalY)
            set_pos(get_pos()+Temporal)
        EndPerformanceTest()
    
        print("\n11-SET_POS Node2D X+CONST Y+CONST Aproach2 " +str(numberofiterations)+" ITERATIONS")
        result = 0
        InitPerformanceTest()
        TemporalX = 0  # reuse the locals declared in test 10 (redeclaring them in the same function may be rejected)
        TemporalY = 0
        for i in range (1,numberofiterations):
            TemporalX += BasicNumber1
            TemporalY += BasicNumber2
            set_pos(Vector2(get_pos().x+TemporalX,get_pos().y+TemporalY))
        EndPerformanceTest()
    
        print("\n12- ARRAY APPEND " +str(numberofiterations)+" x2 ITERATIONS")
        result = Array()
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result.append(BasicNumber1)
            result.append(BasicNumber1)
        EndPerformanceTest()
    
        print("\n13- ARRAY ACCESS " +str(numberofiterations*2)+" ITERATIONS")
        InitPerformanceTest()
        for i in result:
            # Note: i is the stored value (always 1), not an index, so this reads result[1] every time
            var Temp = result[i]
        EndPerformanceTest()
    
        print("\n14- DICTIONARY APPEND " +str(numberofiterations)+" x2 ITERATIONS")
        result = {}
        InitPerformanceTest()
        for i in range (1,numberofiterations):
            result["key"+str(i)]=i
            result["key"+str(i+numberofiterations)]=i+numberofiterations
        EndPerformanceTest()
    
        print("\n15- DICTIONARY ACCESS " +str(numberofiterations*2)+" ITERATIONS")
        InitPerformanceTest()
        for i in result:
            var Temp = result[i]
        EndPerformanceTest()
    
    func SUM2Numbers(Number1, Number2):
        result += Number1
        result += Number2
    
    func InitPerformanceTest():
        InitTimer = OS.get_ticks_msec()
    
    func EndPerformanceTest():
        EndTimer = OS.get_ticks_msec()
        print ("Operation performance: "+str(EndTimer-InitTimer)+" msec")
    
  • _807_ Posts: 57 Member

    My results:


    GODOT 2.1.3. Velocity GDSCript and API function access test


    0- CONSTANTS INT 1000000 x2 SUM iteration
    Operation performance: 286 msec

    1-INT VAR 1000000 x2 SUM iteration
    Operation performance: 295 msec

    2-FLOAT VAR 1000000 x2 SUM iteration
    Operation performance: 292 msec

    3-ENUM CONST 1000000 x2 SUM iteration
    Operation performance: 299 msec

    4-VECTOR2 sum VEC2CONST 1000000 x1 SUM iteration
    Operation performance: 192 msec

    5-VECTOR2 sum VEC2CONST to members 1000000 independent XY SUM iteration
    Operation performance: 1361 msec

    6-FUNCTIONCALL SUM IN SCRIPT GLOBAL VAR 1000000 x2 ITERATIONS
    Operation performance: 549 msec

    7-SET_POS NO CHANGES OPERATION 1000000 ITERATIONS
    Operation performance: 412 msec

    8-GET_POS NO CHANGES OPERATION 1000000 ITERATIONS
    Operation performance: 334 msec

    9-SET_POS Node2D X+CONST Y+CONST 1000000 ITERATIONS
    Operation performance: 786 msec

    10-SET_POS Node2D X+CONST Y+CONST Aproach1 1000000 ITERATIONS
    Operation performance: 1017 msec

    11-SET_POS Node2D X+CONST Y+CONST Aproach2 1000000 ITERATIONS
    Operation performance: 1693 msec

    12- ARRAY APPEND 1000000 x2 ITERATIONS
    Operation performance: 536 msec

    13- ARRAY ACCESS 2000000 ITERATIONS
    Operation performance: 328 msec

    14- DICTIONARY APPEND 1000000 x2 ITERATIONS
    Operation performance: 3762 msec

    15- DICTIONARY ACCESS 2000000 ITERATIONS
    Operation performance: 747 msec

  • _807_ Posts: 57 Member
    edited July 2017

    I made an export build and have a few notes and opinions:

    ·Operations on the x and y members of a Vector2 have optimizations that give 2x performance. Even so, operating directly on the x and y members of a Vector2 is 3 times slower than operating directly on float values.

    ·Direct Vector2 vs. Vector2 operations have the same performance as direct float vs. float operations. That is good, but I really have no idea how to take advantage of it without sometimes accessing vector members. It is too difficult for me to always think in vector math, and occasionally accessing members can cost more performance than direct vector manipulation gains.

    ·int vs. int, int vs. float, etc. have the same performance; the type does not matter. "const" gives a very, very tiny performance gain. Enums too, but even tinier. These last two points are probably down to something wrong in my test, or something else...

    ·Arrays are faster than Dictionaries (probably because of the dictionary index). In the editor array access is 2x faster, in release 3x. The test is 1 million iterations with numbers; I did not test bigger data or other types. The Array is a generic array, which is supposed to be faster than typed arrays with few entries; I read this in some doc.

    ·What surprises me most is the penalty for passing parameters to a function. Isn't the code of a function supposed to be copied to the call sites when exporting? (Running the test with ten million simple add-number iterations with and without a function call takes 5 vs. 3 seconds. With "code replacement" in compiled scripts the performance should be the same, shouldn't it? Am I wrong about this?)
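
The function-call comparison in the last point can be tried in any interpreted language. The following is a minimal Python sketch, not GDScript; `sum2`, `add_inline`, `add_via_call`, and `timed` are hypothetical names. CPython does not copy a function's body into the loop that calls it, so the second version pays for a call and for argument passing on every iteration. Whether a GDScript export does any such "code replacement" is not something this sketch can answer.

```python
import time


def sum2(total, a, b):
    """Add two numbers to a running total through a function call."""
    return total + a + b


def add_inline(iterations, a=1, b=2):
    """Do the two additions directly inside the loop."""
    total = 0
    for _ in range(iterations):
        total += a
        total += b
    return total


def add_via_call(iterations, a=1, b=2):
    """Do the same additions, but through sum2() on every iteration."""
    total = 0
    for _ in range(iterations):
        total = sum2(total, a, b)
    return total


def timed(func, iterations):
    """Return (result, elapsed_msec) for one run of func(iterations)."""
    start = time.perf_counter()
    result = func(iterations)
    return result, (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    for func in (add_inline, add_via_call):
        result, msec = timed(func, 200000)
        print("%s: result=%d, %.1f msec" % (func.__name__, result, msec))
```

Both functions compute the same total, so any difference in the printed times comes from the call itself.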

  • Shin-NiL Posts: 154 Moderator

    Very interesting. So Zylann was right about Vector math operations.

  • _807_ Posts: 57 Member

    Good news in the Godot simple iteration test:


    GODOT 3.0 Calinou 16/07/2017 alpha custom build - Velocity GDSCript and API function access test in preview


    0- CONSTANTS INT 1000000 x2 SUM iteration
    Operation performance: 257 msec

    1-INT VAR 1000000 x2 SUM iteration
    Operation performance: 256 msec

    2-FLOAT VAR 1000000 x2 SUM iteration
    Operation performance: 260 msec

    3-ENUM CONST 1000000 x2 SUM iteration
    Operation performance: 258 msec

    4-VECTOR2 sum VEC2CONST 1000000 x1 SUM iteration
    Operation performance: 153 msec

    5-VECTOR2 sum VEC2CONST to members 1000000 independent XY SUM iteration
    Operation performance: 1113 msec

    6-FUNCTIONCALL SUM IN SCRIPT GLOBAL VAR 1000000 x2 ITERATIONS
    Operation performance: 488 msec

    6.1-FUNCTIONCALL funcref SUM IN SCRIPT GLOBAL VAR 1000000 x2 ITERATIONS
    Operation performance: 630 msec

    7-SET_POS NO CHANGES OPERATION 1000000 ITERATIONS
    Operation performance: 322 msec

    8-GET_POS NO CHANGES OPERATION 1000000 ITERATIONS
    Operation performance: 284 msec

    9-SET_POS Node2D X+CONST Y+CONST 1000000 ITERATIONS
    Operation performance: 673 msec

    10-SET_POS Node2D X+CONST Y+CONST Aproach1 1000000 ITERATIONS
    Operation performance: 930 msec

    11-SET_POS Node2D X+CONST Y+CONST Aproach2 1000000 ITERATIONS
    Operation performance: 1496 msec

    12- ARRAY APPEND 1000000 x2 ITERATIONS
    Operation performance: 463 msec

    13- ARRAY ACCESS 2000000 ITERATIONS
    Operation performance: 313 msec

    14- DICTIONARY APPEND 1000000 x2 ITERATIONS
    Operation performance: 2787 msec

    15- DICTIONARY ACCESS 2000000 ITERATIONS
    Operation performance: 745 msec

  • _807_ Posts: 57 Member

    I think performance depends a lot on the system and the compiler version... But for now, Godot 3.0 in preview is faster than Godot 2.1.3 in preview. (I can't test exports because there are no export templates for this version.)
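
To put a number on "faster", the two result posts can be compared test by test. This is a small Python sketch; the timings are copied from the two editor-preview runs posted in this thread, while the names `GODOT_2_1_3`, `GODOT_3_0`, and `speedups` are mine. Test 6.1 is left out because it only exists in the 3.0 run.

```python
# Timings in msec, copied from the two result posts in this thread
# (both are editor-preview runs, keyed by test number).
GODOT_2_1_3 = {
    0: 286, 1: 295, 2: 292, 3: 299, 4: 192, 5: 1361, 6: 549, 7: 412,
    8: 334, 9: 786, 10: 1017, 11: 1693, 12: 536, 13: 328, 14: 3762, 15: 747,
}
GODOT_3_0 = {
    0: 257, 1: 256, 2: 260, 3: 258, 4: 153, 5: 1113, 6: 488, 7: 322,
    8: 284, 9: 673, 10: 930, 11: 1496, 12: 463, 13: 313, 14: 2787, 15: 745,
}


def speedups(old, new):
    """Return {test: old_msec / new_msec} for tests present in both runs."""
    return {test: old[test] / new[test] for test in sorted(old) if test in new}


if __name__ == "__main__":
    for test, ratio in speedups(GODOT_2_1_3, GODOT_3_0).items():
        print("test %2d: %.2fx" % (test, ratio))
```

With these figures every test comes out at least slightly faster in the 3.0 build. The largest relative gain is in test 14 (dictionary append, about 1.35x) and the smallest in test 15 (dictionary access, essentially unchanged).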

  • user_id Posts: 1 Member

    Hi,
    I can't find anything relevant about GPU array operations (something à la ArrayFire). Is it possible with GDScript?
