Work with git objects from Python

21 August 2017 · 1 minute read

Git stores its data as files under .git/objects.

These files are compressed with zlib, and Python has a zlib module, so we can read a git object file from Python:

>>> import zlib
>>> f = open('.git/objects/9d/aeafb9864cf43055ae93beb0afd6c7d144bfa4', 'rb')
>>> data = f.read()
>>> print(data)
b'x\x01K\xca\xc9OR0e(I-.\xe1\x02\x00\x19\xa6\x03\xbf'
>>> zlib.decompress(data)
b'blob 5\x00test\n'
>>>

The header blob 5 says that this is a blob object and that its content, test\n, is 5 bytes long. A NUL byte (\x00) separates the header from the content. Git has four object types: blob, tree, commit and tag.
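The header layout described above can be split apart in a few lines. This is only a minimal sketch: parse_loose_object is a hypothetical helper name, and the sample bytes are compressed in memory rather than read from a real repository.

```python
import zlib

def parse_loose_object(raw):
    """Split a zlib-compressed loose object into (type, size, content)."""
    data = zlib.decompress(raw)
    # Everything before the first NUL byte is the header, e.g. b'blob 5'.
    header, _, content = data.partition(b'\x00')
    kind, _, size = header.partition(b' ')
    size = int(size)
    # The declared size should match the number of content bytes.
    if size != len(content):
        raise ValueError('header says %d bytes, found %d' % (size, len(content)))
    return kind.decode(), size, content

# Build the same object as in the session above, without touching .git.
raw = zlib.compress(b'blob 5\x00test\n')
print(parse_loose_object(raw))  # ('blob', 5, b'test\n')
```

Passing the bytes read from a file under .git/objects instead of the in-memory sample should give the same kind of tuple for any loose object.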

We can also check the hash:

>>> import hashlib
>>> hashlib.sha1(zlib.decompress(data)).hexdigest()
'9daeafb9864cf43055ae93beb0afd6c7d144bfa4'
>>>

The hash is exactly the same as the path we opened earlier: the directory name 9d followed by the file name.
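The same check works without an existing file, because the hash covers only the header and the content. A small sketch, where git_object_id is a hypothetical helper name:

```python
import hashlib

def git_object_id(kind, content):
    """Return the SHA-1 hex digest git would use for this object."""
    # Header format: b'<type> <length of content in bytes>\x00'
    header = kind.encode() + b' ' + str(len(content)).encode() + b'\x00'
    return hashlib.sha1(header + content).hexdigest()

print(git_object_id('blob', b'test\n'))
# 9daeafb9864cf43055ae93beb0afd6c7d144bfa4
```

The first two characters of the result are the directory name under .git/objects and the remaining 38 are the file name.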

Create your own object file

The Pro Git book shows how to create an object file in Ruby. To do the same in Python we need the modules we used for reading, plus the os module to create the directory:

>>> import zlib
>>> import hashlib
>>> import os
>>> s = 'blob 11\x00new string\x00'.encode()

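The header declares 11 bytes: the 10 characters of new string plus the trailing \x00. Get the hash and write the compressed object: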

>>> sha = hashlib.sha1(s).hexdigest() # 90cc64bc4738805499e19ab3bab69ecb2c3a16c0
>>> os.makedirs('.git/objects/' + sha[0:2], exist_ok=True) # .git/objects/90, created only if missing
>>> f = open('.git/objects/' + sha[0:2] + '/' + sha[2:], 'wb')
>>> f.write(zlib.compress(s))
27
>>> f.close()

Now we have a new, well-formed git object in .git/objects/90.
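The steps above can be wrapped into one function. This is a sketch under a few assumptions: write_loose_object is a hypothetical helper name, and the example writes into a temporary directory instead of a real .git/objects so that it can be run anywhere.

```python
import hashlib
import os
import tempfile
import zlib

def write_loose_object(objects_dir, kind, content):
    """Write content as a loose object under objects_dir; return (sha, path)."""
    store = kind.encode() + b' ' + str(len(content)).encode() + b'\x00' + content
    sha = hashlib.sha1(store).hexdigest()
    # The first two hex characters name the directory, the rest name the file.
    folder = os.path.join(objects_dir, sha[:2])
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, sha[2:])
    with open(path, 'wb') as f:
        f.write(zlib.compress(store))
    return sha, path

# A temporary directory stands in for .git/objects here.
with tempfile.TemporaryDirectory() as objects_dir:
    sha, path = write_loose_object(objects_dir, 'blob', b'test\n')
    with open(path, 'rb') as f:
        restored = zlib.decompress(f.read())
    print(sha)       # 9daeafb9864cf43055ae93beb0afd6c7d144bfa4
    print(restored)  # b'blob 5\x00test\n'
```

In a real repository, passing '.git/objects' as objects_dir and then running git cat-file -p with the returned hash should print the content back.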
