Once again, I should be working on my thesis, so I'm dumping thoughts here so they stop taking up VRAM.
Tech things
Making good progress on figuring out the structure of replay files. I bit the bullet and downloaded Ghidra, which is extremely overkill but I'm enjoying it. It lets me define types to chunk the memory into, which is mostly uint for these files with a few ulonglongs. It also automatically finds two-byte Unicode strings, which is great (though I did have to adjust a few of them to fix offsets, because it didn't automatically include the trailing 00). It also lets me add various types of comments!
Here's some information I've already found in replay files that looks promising:
**************************************************************
* Round start information? *
**************************************************************
00000480 92 79 f3 03 uint 66288018
00000484 66 dc cc 90 uint 2429344870
00000488 d9 11 00 00 uint 4569
0000048c 01 00 00 00 uint 1
00000490 01 00 00 00 uint 1
00000494 00 00 00 00 uint 0
00000498 01 00 00 00 uint 1
0000049c 00 00 00 00 uint 0
000004a0 a0 86 01 00 uint 100000
burst meter?
000004a4 a0 86 01 00 uint 100000
burst meter p2?
I want to test whether x480 and x484 are really a single ulonglong rather than two separate uints. My hunch is that it's some form of checksum, timestamp, or other index, so that the game knows where to jump when you tell it to skip to the next round.
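A quick way to eyeball both readings of those eight bytes is to unpack them with Python's struct module. This is only a sketch: the bytes are copied from the listing above, and little-endian order is assumed because that's how the uint values in the listing decode.

```python
import struct

# The eight bytes at 0x480..0x487, copied from the listing above.
raw = bytes.fromhex("9279f303" "66dccc90")

# Reading 1: two independent little-endian 32-bit unsigned ints.
lo, hi = struct.unpack("<II", raw)

# Reading 2: one little-endian 64-bit unsigned int over the same bytes.
(combined,) = struct.unpack("<Q", raw)

print("as two uints:    ", lo, hi)
print("as one ulonglong:", combined, hex(combined))

# Sanity check: the 64-bit reading is just the two halves glued together,
# with the word at 0x484 as the high half.
print(combined == ((hi << 32) | lo))
```

Neither reading proves anything by itself; the useful part is running this over the same offsets in several replays and seeing which interpretation changes in a way that makes sense.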
I think I've also found the stage and music information. I'll need to change these values to test, but it seems logical:
00000398 02 00 00 00 uint 2
# of rounds to win
0000039c 63 00 00 00 uint 99
seconds per round
000003a0 0f 00 00 00 uint 15
unknown - stage or music?
000003a4 20 00 00 00 uint 32
unknown - stage or music?
000003a8 04 64 00 00 uint 25604 repeated in 00000414
000003ac 00 00 00 00 uint 0
000003b0 19 00 00 00 uint 25
?
000003b4 00 00 00 00 uint 0
000003b8 01 00 00 00 uint 1 repeated in 00000424
000003bc 05 00 00 00 uint 5
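For reference, here's a rough sketch of pulling that block out as named fields instead of reading it off the listing. The field names are placeholders: only the first two carry the guesses noted above, and the rest are just named after their offsets until they're identified.

```python
import struct

# Bytes 0x398..0x3bf copied from the listing above (ten little-endian uints).
BLOCK_OFFSET = 0x398
block = bytes.fromhex(
    "02000000" "63000000" "0f000000" "20000000" "04640000"
    "00000000" "19000000" "00000000" "01000000" "05000000"
)

# Labels are guesses, not confirmed meanings. Unknown fields use their offset.
FIELD_NAMES = [
    "rounds_to_win", "seconds_per_round", "unk_3a0", "unk_3a4", "unk_3a8",
    "unk_3ac", "unk_3b0", "unk_3b4", "unk_3b8", "unk_3bc",
]

def parse_block(data, names):
    """Unpack len(names) little-endian uint32 values into a name -> value dict."""
    values = struct.unpack("<%dI" % len(names), data)
    return dict(zip(names, values))

settings = parse_block(block, FIELD_NAMES)
for i, (name, value) in enumerate(settings.items()):
    print(f"{BLOCK_OFFSET + 4 * i:08x}  {name:<18} {value}")
```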
There's one more interesting value I need to find: how long the players let the stage intro and character intro play out before skipping. I'm sure that's stored somewhere, because the skip delay in a replay is consistent with the real match. I'd guess it's the values at x480 and x484, since players can also choose to skip the per-round win animation, but that doesn't quite work out: there's only one animation to skip there, but two animations to skip at match start. My guess is it's some ulonglong representing the number of ms after which to skip. (Maybe that much isn't necessary and x488 suffices?)
I'll need to compare between known replays to check. Have one where we don't skip, one where we instantly skip, and one where we skip a little bit in. Keep everything else consistent (players, characters, stages, etc. if possible). This'll be a nice thing to know so maybe I can programmatically update replays to skip intros so I don't have to worry about editing them out in post. Or add them back for cinematic purposes.
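The comparison itself can be mostly automated. Below is a minimal sketch that walks several replays in 4-byte steps and reports every offset where they disagree. The function name is made up, the demo data is synthetic, and it assumes the interesting fields are 4-byte aligned uints like the ones found so far.

```python
import struct

def diff_words(replays, start=0, end=None):
    """Return {offset: [value in each replay]} for every aligned
    little-endian uint32 that is not identical across all replays."""
    shortest = min(len(r) for r in replays)
    end = shortest if end is None else min(end, shortest)
    changed = {}
    for off in range(start, end - 3, 4):
        values = [struct.unpack_from("<I", r, off)[0] for r in replays]
        if len(set(values)) > 1:
            changed[off] = values
    return changed

# Synthetic stand-ins for "no skip", "instant skip", and "skip partway in".
# With real files this would be something like Path(name).read_bytes().
no_skip = bytearray(16)
instant = bytearray(16)
partway = bytearray(16)
struct.pack_into("<I", no_skip, 8, 9000)
struct.pack_into("<I", partway, 8, 1500)

for off, values in diff_words([no_skip, instant, partway]).items():
    print(f"{off:08x}", values)
```

If the three replays really are identical apart from the skip timing, the list of differing offsets should be short enough to check by hand.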
There are a few more values I bet I can find in replays that I still need to suss out. Interestingly, it doesn't seem like icons are saved in replays, so I don't have to worry about those. No player card information is saved at all, in fact, other than username and rank. (I will need to be able to translate between named ranks and whatever number they use to represent them.)
What's mildly annoying about all of this is despite how powerful Ghidra is, it seems like there still isn't really a way to copy "annotations" from one file to another. The closest I can do is define structs, put those structs into a project archive, and re-use them in another file. I think that's what I'm going to do -- just build up more and more complex structs that end up containing the overall structure of the file. For example, I made a struct that encompasses the entirety of the date representation, including the unix epoch representation, the numeric representation, and the string representation.
I'm running into a small bit of an issue over player names. I can't use the same player struct for the recording player metadata, because that makes everything off by two bytes. I'm not sure what that's about. Maybe I need to make my player strings include two more 00 00s just to keep things consistent.
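One way to poke at the off-by-two problem outside Ghidra is to read the names as null-terminated two-byte strings and look at where each one actually ends. This is a sketch under the assumption that the names are UTF-16LE with a 00 00 terminator; the helper name and the sample name are made up.

```python
def read_utf16z(data, offset):
    """Read a UTF-16LE string starting at offset, stopping at the first
    00 00 code unit. Returns (text, next_offset), where next_offset points
    just past the terminator."""
    end = offset
    while end + 1 < len(data) and data[end:end + 2] != b"\x00\x00":
        end += 2
    text = data[offset:end].decode("utf-16-le")
    return text, end + 2

# Made-up sample: a name, its terminator, then an unrelated uint.
sample = "Player".encode("utf-16-le") + b"\x00\x00" + b"\x07\x00\x00\x00"
name, next_offset = read_utf16z(sample, 0)
print(repr(name), next_offset)
```

Comparing next_offset against where the following field starts in each of the two player blocks should show whether one of them has extra padding after the terminator.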
I'd put an example of these structs here, but I've already closed it for the day and I think my brain will fold in on itself if I try to re-align it to that work again.
Software Design
The more I work on this, the more I agree with my friend's suggestion to make a library first and a separate UI on top of it. I probably want a library for actually munging the replay and replay_list files, because now I have two UIs I want to make:
- The user-friendly replay_list repair and replay.dat replacement UI
- A tool I can use to edit replays by hand while testing what values mean what.
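As a rough idea of what the shared library core could look like, here's a tiny sketch that both UIs could sit on. The class and method names are hypothetical, and it deliberately knows nothing about replay_list or checksums yet.

```python
import struct

class ReplayBuffer:
    """Hypothetical library core: owns a replay's bytes and exposes
    little-endian uint32 reads and writes by offset."""

    def __init__(self, data):
        # Copy into a bytearray so edits never touch the caller's data.
        self.data = bytearray(data)

    def read_u32(self, offset):
        return struct.unpack_from("<I", self.data, offset)[0]

    def write_u32(self, offset, value):
        struct.pack_into("<I", self.data, offset, value)

    def to_bytes(self):
        return bytes(self.data)

# Demo on an empty stand-in buffer rather than a real replay.
buf = ReplayBuffer(bytes(0x3a0))
buf.write_u32(0x39c, 60)  # 0x39c is the suspected seconds-per-round field
print(buf.read_u32(0x39c))
```

The hand-editing tool would mostly be a thin front end over read_u32 and write_u32, while the repair UI would call higher-level functions built on the same class.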
Additional thought
It seems like the in-memory representation of a replay may be identical to the way it's written in the binary file, if the offsets in the improvement mod's headers are anything to go by. Instead of modifying the files, maybe I just download Cheat Engine and fuck with the values in game? Alternatively, I edit a file and load it using IM to see what changes, so I don't have to worry about keeping replay_list or checksums in check while testing.
I'm a little worried about using Cheat Engine just because it's such a red flag and I'm very bad about making sure applications aren't running before doing other things. Even my friend got instabanned from Warframe because he accidentally opened it while trying to open Word, although he managed to get his account unbanned by explaining that it was an accident.