Module: ContextSpook::Utils
Defined in: lib/context_spook/utils.rb
Overview
The ContextSpook::Utils module provides utility methods for formatting and processing context data.
Class Method Summary
- .estimate_tokens(text) ⇒ Integer
  The estimate_tokens method provides a crude estimate of the token count based on the size of the input text.
- .format_size(context_size) ⇒ String
  The format_size method converts a byte size value into a human-readable string with binary units.
- .format_tokens(tokens) ⇒ String
  The format_tokens method converts a token count into a human-readable string using SI prefixes (e.g., 1.2 kT).
Instance Method Summary
- #estimate_tokens(text) ⇒ Integer (private)
  The estimate_tokens method provides a crude estimate of the token count based on the size of the input text.
- #format_size(context_size) ⇒ String (private)
  The format_size method converts a byte size value into a human-readable string with binary units.
- #format_tokens(tokens) ⇒ String (private)
  The format_tokens method converts a token count into a human-readable string using SI prefixes (e.g., 1.2 kT).
Class Method Details
.estimate_tokens(text) ⇒ Integer
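The estimate_tokens method provides a crude estimate of the token count based on the size of the input text.
It follows a heuristic of roughly 3.5 bytes per token. The implementation calls String#size, which counts characters rather than bytes, so the result matches the byte-based heuristic only for single-byte (e.g. ASCII) text; for multibyte UTF-8 input it is lower than a byte-based estimate would be.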
    # File 'lib/context_spook/utils.rb', line 38
    def estimate_tokens(text)
      (text.size.to_f / 3.5).ceil
    end
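The difference between counting characters and counting bytes can be illustrated with a minimal standalone sketch. Both helper names below (estimate_tokens_by_chars, estimate_tokens_by_bytes) and the BYTES_PER_TOKEN constant are hypothetical and not part of ContextSpook; the first mirrors the listed source, the second is an assumed variant that applies the 3.5-bytes-per-token wording literally.

```ruby
# Hypothetical helpers for illustration only; not part of ContextSpook.
BYTES_PER_TOKEN = 3.5

# Mirrors the listed source: String#size counts characters.
def estimate_tokens_by_chars(text)
  (text.size.to_f / BYTES_PER_TOKEN).ceil
end

# Assumed variant: String#bytesize counts bytes in the string's encoding.
def estimate_tokens_by_bytes(text)
  (text.bytesize.to_f / BYTES_PER_TOKEN).ceil
end

ascii = "hello world"          # 11 characters, 11 bytes
multi = "\u65E5\u672C\u8A9E"   # 3 characters, 9 bytes in UTF-8

puts estimate_tokens_by_chars(ascii)  # 4
puts estimate_tokens_by_bytes(ascii)  # 4
puts estimate_tokens_by_chars(multi)  # 1
puts estimate_tokens_by_bytes(multi)  # 3
```

For ASCII input the two estimates coincide; they diverge only when characters occupy more than one byte.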
.format_size(context_size) ⇒ String
The format_size method converts a byte size value into a human-readable string with binary units.
This method takes a raw byte count and formats it using the Tins::Unit library to display the size with appropriate binary prefixes (KiB, MiB, etc.) and two decimal places.
    # File 'lib/context_spook/utils.rb', line 16
    def format_size(context_size)
      Tins::Unit.format(context_size, format: '%.2f %U', unit: ?b, prefix: 1024)
    end
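The idea of scaling by powers of 1024 can be sketched without the Tins dependency. The helper below (format_binary_size) and its prefix table are hypothetical and only approximate the behaviour described above; the exact strings produced by Tins::Unit.format, including the unit letter and prefix spelling, may differ.

```ruby
# Hypothetical helper for illustration only; not part of ContextSpook or Tins.
BINARY_PREFIXES = ['', 'Ki', 'Mi', 'Gi', 'Ti'].freeze

# Divides by 1024 until the value drops below 1024 (or prefixes run out),
# then prints the value with two decimal places.
def format_binary_size(bytes)
  value = bytes.to_f
  index = 0
  while value >= 1024 && index < BINARY_PREFIXES.size - 1
    value /= 1024
    index += 1
  end
  format('%.2f %sB', value, BINARY_PREFIXES[index])
end

puts format_binary_size(512)           # 512.00 B
puts format_binary_size(1536)          # 1.50 KiB
puts format_binary_size(5 * 1024**2)   # 5.00 MiB
```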
.format_tokens(tokens) ⇒ String
The format_tokens method converts a token count into a human-readable string using SI prefixes (e.g., 1.2 kT).
    # File 'lib/context_spook/utils.rb', line 26
    def format_tokens(tokens)
      Tins::Unit.format(tokens, unit: ?T, prefix: :si_uc, format: '%.1f %U')
    end
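As a rough illustration of SI-prefix scaling for token counts, the following standalone sketch divides by 1000 instead of 1024 and prints one decimal place. The helper name format_si_tokens and its prefix table are assumptions made for this example; it is not the Tins::Unit implementation, and its output may not match that library in every case.

```ruby
# Hypothetical helper for illustration only; not part of ContextSpook or Tins.
SI_PREFIXES = ['', 'k', 'M', 'G', 'T'].freeze

# Divides by 1000 until the value drops below 1000 (or prefixes run out),
# then prints the value with one decimal place and the unit letter T.
def format_si_tokens(tokens)
  value = tokens.to_f
  index = 0
  while value >= 1000 && index < SI_PREFIXES.size - 1
    value /= 1000
    index += 1
  end
  format('%.1f %sT', value, SI_PREFIXES[index])
end

puts format_si_tokens(950)         # 950.0 T
puts format_si_tokens(1200)        # 1.2 kT
puts format_si_tokens(2_500_000)   # 2.5 MT
```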
Instance Method Details
#estimate_tokens(text) ⇒ Integer (private)
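The estimate_tokens method provides a crude estimate of the token count based on the size of the input text.
It follows a heuristic of roughly 3.5 bytes per token. The implementation calls String#size, which counts characters rather than bytes, so the result matches the byte-based heuristic only for single-byte (e.g. ASCII) text; for multibyte UTF-8 input it is lower than a byte-based estimate would be.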
    # File 'lib/context_spook/utils.rb', line 38
    def estimate_tokens(text)
      (text.size.to_f / 3.5).ceil
    end
#format_size(context_size) ⇒ String (private)
The format_size method converts a byte size value into a human-readable string with binary units.
This method takes a raw byte count and formats it using the Tins::Unit library to display the size with appropriate binary prefixes (KiB, MiB, etc.) and two decimal places.
    # File 'lib/context_spook/utils.rb', line 16
    def format_size(context_size)
      Tins::Unit.format(context_size, format: '%.2f %U', unit: ?b, prefix: 1024)
    end
#format_tokens(tokens) ⇒ String (private)
The format_tokens method converts a token count into a human-readable string using SI prefixes (e.g., 1.2 kT).
    # File 'lib/context_spook/utils.rb', line 26
    def format_tokens(tokens)
      Tins::Unit.format(tokens, unit: ?T, prefix: :si_uc, format: '%.1f %U')
    end