Module: ContextSpook::Utils

Defined in:
lib/context_spook/utils.rb

Overview

The ContextSpook::Utils module provides utility methods for formatting and processing context data.

Class Method Details

.estimate_tokens(text) ⇒ Integer

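The estimate_tokens method returns a crude token-count estimate: the length of the input divided by 3.5, rounded up to the next integer.

The heuristic assumes one token is roughly equivalent to 3.5 bytes. The implementation measures length with String#size, which counts characters, so the figure matches the byte size only for single-byte (e.g., ASCII) text.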

Parameters:

  • text (String)

    the content to be estimated

Returns:

  • (Integer)

    the estimated number of tokens



# File 'lib/context_spook/utils.rb', line 38

def estimate_tokens(text)
  (text.size.to_f / 3.5).ceil
end
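
The effect of measuring with String#size rather than the byte size can be sketched as follows. The names approx_tokens and approx_tokens_by_bytes are hypothetical stand-ins defined here for illustration; the first repeats the formula from the listing above, the second is an assumed byte-based variant that is not part of the library.

```ruby
# Divisor taken from the heuristic described above.
UNITS_PER_TOKEN = 3.5

# Hypothetical stand-in repeating the listing's formula (character count).
def approx_tokens(text)
  (text.size.to_f / UNITS_PER_TOKEN).ceil
end

# Assumed byte-based variant, shown only for comparison.
def approx_tokens_by_bytes(text)
  (text.bytesize.to_f / UNITS_PER_TOKEN).ceil
end

ASCII_SAMPLE = "hello world"      # 11 characters, 11 bytes
CJK_SAMPLE   = "日本語のテキスト"  # 8 characters, 24 bytes in UTF-8

approx_tokens(ASCII_SAMPLE)          # 11 / 3.5 = 3.14..., rounded up to 4
approx_tokens_by_bytes(ASCII_SAMPLE) # same result for single-byte text
approx_tokens(CJK_SAMPLE)            # 8 / 3.5 = 2.28..., rounded up to 3
approx_tokens_by_bytes(CJK_SAMPLE)   # 24 / 3.5 = 6.85..., rounded up to 7
```

For ASCII input the two approaches agree; for multibyte text the character-based estimate is lower.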

.format_size(context_size) ⇒ String

The format_size method converts a byte count into a human-readable string with binary units.

It passes the raw byte count to Tins::Unit.format with prefix: 1024, unit: ?b, and the format string '%.2f %U', so the value is scaled by powers of 1024 and shown with two decimal places followed by the unit with its prefix.

Parameters:

  • context_size (Integer)

    the size in bytes to be formatted

Returns:

  • (String)

    the formatted size string with binary units



# File 'lib/context_spook/utils.rb', line 16

def format_size(context_size)
  Tins::Unit.format(context_size, format: '%.2f %U', unit: ?b, prefix: 1024)
end
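
To show what scaling by powers of 1024 means in practice, here is a minimal hand-rolled sketch. The helper binary_size and the BINARY_PREFIXES table are hypothetical and do not reflect how Tins::Unit is implemented; the exact strings Tins::Unit produces may differ.

```ruby
# Assumed prefix table for illustration only.
BINARY_PREFIXES = ['', 'K', 'M', 'G', 'T'].freeze

# Hypothetical helper: scale a byte count by 1024 and print two decimals.
def binary_size(bytes, unit: 'b')
  value = bytes.to_f
  index = 0
  # Divide by 1024 until the value drops below 1024 or the prefixes run out.
  while value >= 1024 && index < BINARY_PREFIXES.size - 1
    value /= 1024
    index += 1
  end
  format('%.2f %s%s', value, BINARY_PREFIXES[index], unit)
end

binary_size(512)          # "512.00 b"
binary_size(1536)         # "1.50 Kb"
binary_size(5 * 1024**2)  # "5.00 Mb"
```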

.format_tokens(tokens) ⇒ String

The format_tokens method converts a token count into a human-readable string with SI prefixes and one decimal place (e.g., 1.2 kT).

Parameters:

  • tokens (Integer)

    the number of tokens to be formatted

Returns:

  • (String)

    the formatted token string



# File 'lib/context_spook/utils.rb', line 26

def format_tokens(tokens)
  Tins::Unit.format(tokens, unit: ?T, prefix: :si_uc, format: '%.1f %U')
end
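
The SI scaling described above can be sketched without the library. The helper si_count and the SI_PREFIXES table are hypothetical names used for illustration; this is not the Tins::Unit implementation, and its output for edge cases may differ.

```ruby
# Assumed prefix table for illustration only.
SI_PREFIXES = ['', 'k', 'M', 'G', 'T'].freeze

# Hypothetical helper: scale a count by 1000 and print one decimal.
def si_count(count, unit: 'T')
  value = count.to_f
  index = 0
  # Divide by 1000 until the value drops below 1000 or the prefixes run out.
  while value >= 1000 && index < SI_PREFIXES.size - 1
    value /= 1000
    index += 1
  end
  format('%.1f %s%s', value, SI_PREFIXES[index], unit)
end

si_count(950)        # "950.0 T"
si_count(1_234)      # "1.2 kT"
si_count(2_500_000)  # "2.5 MT"
```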

Instance Method Details

#estimate_tokens(text) ⇒ Integer (private)

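The estimate_tokens method returns a crude token-count estimate: the length of the input divided by 3.5, rounded up to the next integer.

The heuristic assumes one token is roughly equivalent to 3.5 bytes. The implementation measures length with String#size, which counts characters, so the figure matches the byte size only for single-byte (e.g., ASCII) text.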

Parameters:

  • text (String)

    the content to be estimated

Returns:

  • (Integer)

    the estimated number of tokens



# File 'lib/context_spook/utils.rb', line 38

def estimate_tokens(text)
  (text.size.to_f / 3.5).ceil
end

#format_size(context_size) ⇒ String (private)

The format_size method converts a byte count into a human-readable string with binary units.

It passes the raw byte count to Tins::Unit.format with prefix: 1024, unit: ?b, and the format string '%.2f %U', so the value is scaled by powers of 1024 and shown with two decimal places followed by the unit with its prefix.

Parameters:

  • context_size (Integer)

    the size in bytes to be formatted

Returns:

  • (String)

    the formatted size string with binary units



# File 'lib/context_spook/utils.rb', line 16

def format_size(context_size)
  Tins::Unit.format(context_size, format: '%.2f %U', unit: ?b, prefix: 1024)
end

#format_tokens(tokens) ⇒ String (private)

The format_tokens method converts a token count into a human-readable string with SI prefixes and one decimal place (e.g., 1.2 kT).

Parameters:

  • tokens (Integer)

    the number of tokens to be formatted

Returns:

  • (String)

    the formatted token string



# File 'lib/context_spook/utils.rb', line 26

def format_tokens(tokens)
  Tins::Unit.format(tokens, unit: ?T, prefix: :si_uc, format: '%.1f %U')
end