Code Monkey home page Code Monkey logo

Comments (17)

sverx avatar sverx commented on August 22, 2024 1

OK you can try now. I have added
SMS_loadTileMapColumnatAddr (unsigned int dst, const void *src, unsigned int height)
and the macro
SMS_loadTileMapColumn(x,y,src,height)
so for instance to update 24 tiles in vertical starting from the one at 0,0 you can issue
SMS_loadTileMapColumn (0,0,source_data,24);
where source_data is a pointer to unsigned int - or an array of unsigned int with at least 24 elements.

The function is native asm and pretty fast, and just reordering the asm instructions I made it safe anyway so it doesn't need to be marked as UNSAFE as it anyway can't be faster than this.

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024 1

Wow, that is blazing fast now! It seems to work as expected, thank you so much. I do have one concern, however. While this method works perfect for my needs, I'm wondering how popular it would be for people using a standard horizontal formatted tilemap? Just throwing that thought out there.

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024 1

Valid points, thanks again!

from devkitsms.

sverx avatar sverx commented on August 22, 2024 1

I realized it could be done even faster, so I just pushed an update.

Regarding this topic, instead:

Also also, I'd like to note, an UNSAFE version of SMS_VRAMmemset(W) might come in handy for various other things I'm trying to accomplish manipulating the VRAM directly. Not just this update column scenario.

would you mind elaborating on it? I'd like to understand what are your specific needs for something like that.

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024 1

Yes, I need to re-crunch the numbers just so I'm not talking out my butt. It mainly has to do with sprite processing. I'm currently on vacation, so my response has been delayed until I have free time.

from devkitsms.

sverx avatar sverx commented on August 22, 2024 1

OK, so I'm closing this.

Thank you for suggesting an useful new feature! 😃

from devkitsms.

sverx avatar sverx commented on August 22, 2024

The reason why setting horizontally consecutive tiles is fast and any other way is slow is because the SMS VDP automatically auto-increments the destination address, so you can write a stream of bytes to VRAM without setting a new address only as long as the new write address is the next one as the previous write.

This means that the only way to set a single column of tiles is to issue one SMS_setNextTileatXY and one SMS_setTile for each tile, there's no practical workaround.

Regarding your second request, I will look into it. It shouldn't be that hard to add that.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

To better clarify: the SMS VDP can only auto-increment the VRAM address using a fixed +1 offset and there's no way to set any other delta (as opposed to the MegaDrive/Genesis VDP, IIRC).

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024

I see. I did read something like that when looking over various SMS documentation. I thought maybe there might be some more customization that could be set, or maybe there was more going on under the hood in Devkit SMS. I do appreciate you looking into the UNSAFE variants. That may help in various operations. Thank you.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

An alternative way to update a single column (or a few of them) could be to use the SMS_loadTileMapArea function.
But you would need to prepare the data in a source buffer.
Unfortunately, the current implementation is pretty slow. But I might rewrite that in asm if that's something that you can find useful.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

Regarding your second request, what's your case scenario using "UNSAFE versions of SMS_VRAMmemset and SMS_VRAMmemsetW"? I have a few different options regading how to code them but the more I know about how they will be actually used the best option I will likely be able to pick...

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024

I have tried SMS_loadTileMapArea as well as some other methods. The fastest is still SMS_setNextTileatXY and SMS_setTile, in an unrolled loop. I thought if this couldn't be made faster, that maybe just writing words through an UNSAFE SMS_VRAMmemset could do the job instead. As I'm creating the update during the vblank. I thought the UNSAFE versions could be useful in other areas as well.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

I see, so you would say use one UNSAFE_SMS_VRAMmemsetW call (if there was such a thing) instead of calling SMS_setNextTileatXY and SMS_setTile, but still looping over them - or in an unrolled loop.

I suspect this would save very few cycles anyway, so maybe what could be added instead is some function that updates a single column of the tilemap in a single call, getting the values from a buffer. This way there would be less function call overhead and the function logic would also handle the destination pointer in VRAM, adding 64 bytes to it at each loop. It wouldn't even need to be UNSAFE, which means you could even call that when the vblank is going to finish and it will eventually even partially run outside vblank without issues - and you won't notice screen tearing because it's updating the screen top to bottom so if you start when still in vblank, then the screen raster won't catch you.

Does this sound like a possible solution?

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024

Honestly, a few cycles go a long way.

I like the sound of what you propose, I'm just not clear how that would be implemented. Are you suggesting using something like SMS_VRAMmemsetW? Replacing SMS_setNextTileatXY and SMS_setTile combination? As I believe I have tried this before. I thought it would be quicker, but it was not. I'll revisit it to be sure, however. Maybe this is where an UNSAFE version of SMS_VRAMmemsetW would come in handy?

Also, some more detail. The tiles are in vertical alignment within the ROM. I currently just point to a section of it, and apply it via the unrolled SMS_setNextTileatXY and SMS_setTile sequentially.

Also also, I'd like to note, an UNSAFE version of SMS_VRAMmemset(W) might come in handy for various other things I'm trying to accomplish manipulating the VRAM directly. Not just this update column scenario.

I know I'm probably viewing this in a very uneducated perspective, so please bare with me XD. I obviously don't want to introduce any unnecessary bloat to the library. I have some unorthodox methods I would like to vet, and there seems to be some possible performance gains to be had.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

What I suggested is kind of a special form of SMS_loadTileMapArea that updates just a single tile per each row (thus updating a column) starting from the pointed tile and looping down as fast as possible (yet without resorting to an unrolled loop). I think it could save several cycles by avoiding dozens call overheads.

Library bloat isn't much an issue as long as I keep the functions in separate modules. The SDCC linker won't link the modules containing unused functions anyway.

from devkitsms.

sverx avatar sverx commented on August 22, 2024

If you code an horizontally scrolling engine, you're likely to arrange your map in column-major order. Beside that, in any other case where one needs to update a single column, one could just allocate a small buffer in RAM and populate that with the data needed for the update. Think compressed maps, for example...

from devkitsms.

xfixium avatar xfixium commented on August 22, 2024

Actually, I think you can disregard my other request. I think through some testing, I found a solution that may work. Thank you for your time, knowledge, and skill.

from devkitsms.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.