Module: OllamaChat::Utils::PNGCharacterExtractor
- Defined in: lib/ollama_chat/utils/png_character_extractor.rb
Overview
Extracts embedded character profiles from PNG image files.
This module looks for 'tEXt' chunks in a PNG file whose keyword is 'chara'. The text of such a chunk is expected to be a Base64-encoded JSON string.
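To make the chunk layout concrete, here is a minimal sketch that builds one 'tEXt' chunk carrying a 'chara' payload and decodes it again by hand. The helpers `png_chunk` and `chara_chunk` and the sample profile are hypothetical and exist only for illustration; they are not part of OllamaChat.

```ruby
require 'base64'
require 'json'
require 'zlib'

# Hypothetical helper: serialize one PNG chunk as
# length (4 bytes, big-endian) + type (4 bytes) + data + CRC-32 (4 bytes).
# The CRC covers the type and data fields, not the length.
def png_chunk(type, data)
  body = type.b + data.b
  [data.bytesize].pack('N') + body + [Zlib.crc32(body)].pack('N')
end

# Hypothetical helper: a tEXt chunk body is keyword + NUL byte + text.
# Here the text is the Base64-encoded JSON profile.
def chara_chunk(profile)
  payload = Base64.strict_encode64(JSON.generate(profile))
  png_chunk('tEXt', "chara\x00" + payload)
end

chunk = chara_chunk({ 'name' => 'Ada' })

# Decode the chunk again by hand to confirm the layout round-trips.
length = chunk[0, 4].unpack1('N')
type = chunk[4, 4]
keyword, text = chunk[8, length].split("\x00", 2)
profile = JSON.parse(Base64.decode64(text))
```

The chunk occupies `12 + length` bytes in total: 4 for the length field, 4 for the type, `length` for the data, and 4 for the CRC.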
Class Method Summary
- .extract_character_json(io) ⇒ String?
Extracts the character profile JSON from the provided IO object.
Instance Method Summary
- #extract_character_json(io) ⇒ String? (private)
Extracts the character profile JSON from the provided IO object.
Class Method Details
.extract_character_json(io) ⇒ String?
Extracts the character profile JSON from the provided IO object.
    # File 'lib/ollama_chat/utils/png_character_extractor.rb', line 15

    def extract_character_json(io)
      data = if io.respond_to?(:binmode)
        io.binmode
        io.read
      elsif io.respond_to?(:binread)
        io.binread
      else
        return nil
      end

      # PNG Signature is 8 bytes: \x89PNG\r\n\x1a\n
      # We start reading chunks after the signature
      pos = 8

      while pos < data.length
        # 1. Read Length (4 bytes, Big Endian)
        length = data[pos, 4].unpack1('L>')
        pos += 4

        # 2. Read Chunk Type (4 bytes)
        type = data[pos, 4]
        pos += 4

        # 3. Read Chunk Data
        chunk_data = data[pos, length]
        pos += length

        # 4. Skip CRC (4 bytes)
        pos += 4

        # We are only interested in 'tEXt' chunks
        if type == 'tEXt'
          # tEXt chunks are formatted as: Keyword + NULL Byte + Text
          # We split only on the first NULL byte
          keyword, text = chunk_data.split("\x00", 2)

          if keyword == 'chara'
            begin
              # The content should be Base64 encoded UTF-8 JSON
              decoded_json = Base64.decode64(text)
              decoded_json = decoded_json.encode('UTF-8', invalid: :replace, undef: :replace)
              JSON.parse(decoded_json)
              return decoded_json
            rescue JSON::ParserError, ArgumentError
              return nil
            end
          end
        end
      end
    end
Instance Method Details
#extract_character_json(io) ⇒ String? (private)
Extracts the character profile JSON from the provided IO object.
    # File 'lib/ollama_chat/utils/png_character_extractor.rb', line 15

    def extract_character_json(io)
      data = if io.respond_to?(:binmode)
        io.binmode
        io.read
      elsif io.respond_to?(:binread)
        io.binread
      else
        return nil
      end

      # PNG Signature is 8 bytes: \x89PNG\r\n\x1a\n
      # We start reading chunks after the signature
      pos = 8

      while pos < data.length
        # 1. Read Length (4 bytes, Big Endian)
        length = data[pos, 4].unpack1('L>')
        pos += 4

        # 2. Read Chunk Type (4 bytes)
        type = data[pos, 4]
        pos += 4

        # 3. Read Chunk Data
        chunk_data = data[pos, length]
        pos += length

        # 4. Skip CRC (4 bytes)
        pos += 4

        # We are only interested in 'tEXt' chunks
        if type == 'tEXt'
          # tEXt chunks are formatted as: Keyword + NULL Byte + Text
          # We split only on the first NULL byte
          keyword, text = chunk_data.split("\x00", 2)

          if keyword == 'chara'
            begin
              # The content should be Base64 encoded UTF-8 JSON
              decoded_json = Base64.decode64(text)
              decoded_json = decoded_json.encode('UTF-8', invalid: :replace, undef: :replace)
              JSON.parse(decoded_json)
              return decoded_json
            rescue JSON::ParserError, ArgumentError
              return nil
            end
          end
        end
      end
    end
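A public module method paired with a private instance method of identical source is what Ruby's `module_function` produces. Assuming that is how this module is declared (the page does not show the declaration), the two entries behave as in the sketch below. `Greeter` and `Session` are hypothetical names used only for illustration.

```ruby
# Hypothetical module used only to illustrate module_function.
module Greeter
  module_function

  def greet(name)
    "Hello, #{name}"
  end
end

# Callable directly on the module, like the class method entry.
direct = Greeter.greet('Ada')

# Also available as a private helper in anything that includes the module.
class Session
  include Greeter

  def banner
    greet('Ada')
  end
end

via_include = Session.new.banner
publicly_visible = Session.new.respond_to?(:greet)
```

Under that assumption, callers would normally use the module-level form, `OllamaChat::Utils::PNGCharacterExtractor.extract_character_json(io)`, and the instance form would only be reachable from inside an including class.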