A forum for users of LackeyCCG

Checksums unveiled?

Started by snowdrop, January 25, 2011, 03:41:30 PM

snowdrop

1. How exactly are the checksums generated by LackeyCCG? (It's obviously not an md5 or sha, so what is it?) We want to write code that generates identical checksums and need some info on this.

2. During an update, how does LackeyCCG compare the checksums in the latest online version of updatelist.txt with the checksums of the existing cards on the player's hard drive? Does it compare the values in the two updatelists, or does it generate the local checksums again and check them against the checksums in the online updatelist?

3. Must the checksums be generated by LackeyCCG in the updatelist, or could they be created by my own whims, being totally random numbers and/or letters? (Yes, I know that's a bad idea. Just asking though...)

4. What characters are allowed in a legit checksum and how long may the variable be?

Trevor

Any kind of file should work as far as creating checksums. What exactly are you trying to do?

snowdrop

*smiles*

What I'm trying to do is twofold: a) create a program/script that takes file x and generates the same checksum LackeyCCG would have generated for it, and b) understand what LackeyCCG compares when it updates: does it re-compute the checksums of the local files and check them against the online updatelist, or does it compare the checksums in the local updatelist with the checksums in the online updatelist?

Trevor

I wrote that function a long time ago. It was just meant to be a very simple check to see if the file is the same.


int GetCheckSumFromFile(char PathNamePart[])
{
   FILE* fp = FOpen(PathNamePart,"rb");
   if(fp==NULL)
      return 0; // If there is no file, return with a sum of 0.
   long f1,f2=0;
   char TempChar=0;
   int Sum=0;
   do
      {
      f1 = f2;
      TempChar = fgetc(fp);
      f2 = ftell(fp);
      if(TempChar!=10 && TempChar!=13)
         Sum+=(unsigned int)TempChar;
      Sum=Sum%100000000; //modulo to prevent memory type overflow               
      } while(f1 != f2);
   fclose(fp);
   return Sum;
}

If I wrote it now, I would have done it a different way, but I've always wanted to keep backwards compatibility, so the algorithm stayed the same. As you can see, it is literally a sum of the data.

snowdrop

Thanks for sharing. With your "permission" we'd like to re-do the same functionality in PHP, in a script that will be open source, and hope that's okay. It will be for a card database + card dev tool that we're creating and that will automatically create (updated) LackeyCCG patches, among other things. The goal is to have it generate, at the press of a button, a nice package with the latest cards/revisions/images etc., ready for download and use in Lackey, or better yet, let it dump the patch on a webhost where players can update the game from directly.

Trevor


amcsi

8 years later, I want to also write a script that builds a plugin, and I need to be able to write the checksum myself :P

I also wanted to write this in PHP.
There is one thing I noticed: I had to subtract 1 to get the final result that would match the examples from e.g. here: https://pottertradingcardgame.webs.com/updatelist.txt

That's not covered in your algorithm...
But what's also not covered is how you get to big negative numbers in the case of seemingly larger files (like images).

How do I get larger images to emit the same large negative numbers? Has the algorithm changed?

EDIT: never mind, I figured it out. I didn't realize C does funny conversions when a char cast to unsigned int is added to a signed int. Chars with values 128 and up effectively get added as negative numbers by the LackeyCCG algorithm. Here's the fix applied to my PHP code: https://github.com/amcsi/lycee-overture/commit/05599951f0c5f228c911df7f2ed6e674faf0960e
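To illustrate the conversion described in the edit above, here is a small Python sketch of how a byte value of 128 or more ends up contributing a negative amount. It assumes `char` is signed, as it is on most desktop compilers; the helper name `as_signed_char` is made up for this example and does not come from LackeyCCG or the linked PHP code.

```python
def as_signed_char(byte_value):
    """Interpret an unsigned byte (0..255) the way a signed 8-bit C char would."""
    return byte_value - 256 if byte_value >= 128 else byte_value


# Bytes below 128 are unchanged; bytes from 128 up wrap to negative values.
print(as_signed_char(65))   # 65
print(as_signed_char(128))  # -128
print(as_signed_char(255))  # -1
```

This is why text files (mostly bytes below 128) can match without the fix while images, which are full of high bytes, do not.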

CrazyChucky

Huh... I replicated the checksum function in Python a while back for my own plugin maintenance purposes, but I could only ever get it to work on text files, not images. That's fine, since I only really need it for the text files that get updated, but now this makes me wonder if this is the piece I was missing. Thank you for the info. I may do some fiddling.

The "subtract 1" thing really had me scratching my head too, but it turns out it's because fgetc returns EOF, which equals -1, at the end of the file, and the loop adds that value to the sum before it stops.

(In case it's useful to anyone, here's the function as I currently have it.)

def getChecksum(fileName):
    """Reimplementation of Lackey's checksum function.
    It doesn't work right on image files, but that's
    okay, I'm only running it on text files.
    """
    checksum = 0
    with open(fileName, 'rb') as f:
        byte = f.read(1)
        while byte:
            n = int.from_bytes(byte, byteorder='big')
            if n != 10 and n != 13:
                checksum = checksum + n
            checksum %= 100000000
            byte = f.read(1)
    checksum -= 1 #In C++, EOF returns -1
    checksum %= 100000000
    print(f'Checksum for {fileName}: {checksum}')
    return checksum

CrazyChucky

I got curious again and revisited this. Even with your trick of subtracting 256, amcsi, I wasn't able to get the correct checksum on all image files. It was bugging me, so eventually I just learned how to call C++ from inside Python. No need to replicate C++'s implementation details if I can use it for real!


# in Python
import ctypes
c_checksum = ctypes.CDLL('checksum.so') # would just be 'checksum' on Windows, I think

def get_checksum(filename):
    """Calls the (mostly) original C++ function."""
    # str(filename) is so this function can be called with a Path object too
    string_argument = ctypes.create_string_buffer(str(filename).encode())
    return c_checksum.get_checksum(string_argument)


// checksum.cc, which is compiled to make the shared library
#include <cstdio>  // FILE, fopen, fgetc, ftell, fclose

extern "C" int get_checksum(char filename[])
{
    FILE* fp = fopen(filename, "rb");
    if(fp == NULL) {
        return 0; // If there is no file, return with a sum of 0.
    }
    long current, next = 0;
    char temp_char = 0;
    int sum = 0;
    do
        {
        current = next;
        temp_char = fgetc(fp);
        next = ftell(fp);
        if (temp_char != 10 && temp_char != 13)
            sum += (unsigned int)temp_char;
        sum %= 100000000; // modulo to prevent memory type overflow
        } while (current != next);
    fclose(fp);
    return sum;
}

lionelpx

I am pleased to let you know I have cracked this  8).

This is what you need to understand from Trevor's code:
  • Yes, it does consume the EOF -1 (\xFF) value, so you need to add it yourself
  • It does read each char as a signed char (on platforms where char is signed). Sum is a (signed) int, so even though the code casts temp_char to unsigned int, the sign-extended value wraps back to the same negative number when the result of the compound assignment is stored in the signed int. The cast has no practical impact; you can remove/ignore it.
  • And the final very sneaky thing: it uses the C modulo operator %. The modulo operator in C does not work as it does in Python (see Guido's explanation). The C modulo is truncated while the Python modulo is floored. While the Python behavior makes sense (in number theory, you would go that way), it yields different values for negative integers, which is what makes us fail here. So you have to use a proper C-style truncated modulo implementation. Luckily, Python offers math.fmod.
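As a quick illustration of the last point, this sketch compares Python's floored `%` with a C-style truncated remainder built on `math.fmod`. The helper name `c_mod` is invented for this example. `math.fmod` works on floats, which is fine for the small integers involved here.

```python
import math


def c_mod(a, b):
    """Truncated remainder, like C's % on ints: the result takes the sign of a."""
    return int(math.fmod(a, b))


print(-7 % 3)              # 2   (Python: floored)
print(c_mod(-7, 3))        # -1  (C-style: truncated)
print(7 % 3, c_mod(7, 3))  # 1 1 (identical for non-negative values)
```

The two only disagree when the running sum is negative, which is exactly the case that files with many high bytes trigger.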

So this is what you get:
import math

def checksum(path):
    value = 0
    with open(path, "rb") as fp:
        char = fp.peek(1)
        while char:
            char = fp.read(1)
            if char in [b"\n", b"\r"]:
                continue
            if char:
                value += int.from_bytes(char, byteorder="big", signed=True)
            else:
                value -= 1
            value = int(math.fmod(value, 100000000))
    return value

On my end, it works for all files, text or binary, images, and sounds alike. Have fun!
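For anyone who prefers to avoid floats, below is a rough integer-only sketch of the same logic, operating on a bytes object. It is meant to mirror the function above, not to replace it, and has not been checked against LackeyCCG itself. The name `lackey_checksum_bytes` is made up here, and it assumes `char` is signed as discussed earlier in the thread.

```python
from itertools import chain

MOD = 100000000


def lackey_checksum_bytes(data):
    """Checksum of a bytes object, mirroring the C loop with integer arithmetic only."""
    total = 0
    # The trailing 255 stands in for the EOF value (-1) that the C loop also adds.
    for b in chain(data, (255,)):
        signed = b - 256 if b >= 128 else b   # interpret as a signed 8-bit char
        if signed != 10 and signed != 13:     # skip LF and CR
            total += signed
        # C-style truncated remainder: the result keeps the sign of total
        total = total % MOD if total >= 0 else -((-total) % MOD)
    return total


print(lackey_checksum_bytes(b"abc"))       # 293  (97 + 98 + 99 - 1)
print(lackey_checksum_bytes(b"a\nb\r"))    # 194  (97 + 98 - 1)
print(lackey_checksum_bytes(b"\xff\xff"))  # -3   (-1 - 1 - 1)
```

To use it on a file, pass in the file's contents, for example `lackey_checksum_bytes(pathlib.Path(path).read_bytes())`. Note that Trevor's original returns 0 for a missing file, which this sketch does not handle.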