Class: OllamaChat::Utils::Fetcher

Inherits:
Object
Defined in:
lib/ollama_chat/utils/fetcher.rb

Overview

A fetcher implementation that handles retrieval and caching of HTTP resources.

This class provides functionality to fetch content from URLs, with support for caching responses and their metadata. It handles various content types and integrates with different cache backends to improve performance by avoiding redundant network requests.

Examples:

Fetching content from a URL with caching

OllamaChat::Utils::Fetcher.get('https://example.com/data.json', cache: redis_cache) do |tmp|
  # Process the fetched content
end

Defined Under Namespace

Modules: HeaderExtension

Classes: RetryWithoutStreaming

Constructor Details

#initialize(debug: false, http_options: {}) ⇒ Fetcher

The initialize method sets up the fetcher instance with debugging and HTTP configuration options.

Parameters:

  • debug (TrueClass, FalseClass) (defaults to: false)

    enables or disables debug output

  • http_options (Hash) (defaults to: {})

    additional options to pass to the HTTP client



# File 'lib/ollama_chat/utils/fetcher.rb', line 181

def initialize(debug: false, http_options: {})
  @debug        = debug
  @started      = false
  @streaming    = true
  @http_options = http_options
end

Class Method Details

.execute(command) {|tmpfile| ... } ⇒ Object

The execute method runs a shell command and processes its output.

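Unless the command already contains 2>&1, the method appends it so that standard error is merged into standard output. The combined output is written to a temporary file, tagged with the text/plain content type, and yielded to the caller. If an exception occurs during execution, the error is reported on standard error and HeaderExtension.failed is yielded instead.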

Parameters:

  • command (String)

    the shell command to execute

Yields:

  • (tmpfile)


# File 'lib/ollama_chat/utils/fetcher.rb', line 153

def self.execute(command, &block)
  Tempfile.open do |tmp|
    unless command =~ /2>&1/
      command += ' 2>&1'
    end
    IO.popen(command) do |io|
      until io.eof?
        tmp.write io.read(1 << 14)
      end
      tmp.rewind
      tmp.extend(OllamaChat::Utils::Fetcher::HeaderExtension)
      tmp.content_type = MIME::Types['text/plain'].first
      block.(tmp)
    end
  end
rescue => e
  STDERR.puts "Cannot execute #{command.inspect} (#{e})"
  if @debug && !e.is_a?(RuntimeError)
    STDERR.puts "#{e.backtrace * ?\n}"
  end
  yield HeaderExtension.failed
end
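
The capture-to-tempfile pattern described above can be sketched with the standard library alone. This is a minimal illustration rather than the library's implementation: the helper name capture_output is hypothetical, it omits the HeaderExtension decoration and error handling, and it assumes a POSIX shell is available.

```ruby
require 'tempfile'

# Hypothetical helper: run a shell command, merge stderr into stdout,
# buffer the combined output in a Tempfile, and yield that file.
def capture_output(command)
  # Append the redirection only if the caller has not already done so.
  command += ' 2>&1' unless command.include?('2>&1')
  Tempfile.open('capture') do |tmp|
    IO.popen(command) do |io|
      # Copy in 16 KiB chunks so large outputs are not held in memory at once.
      tmp.write(io.read(1 << 14)) until io.eof?
    end
    tmp.rewind
    yield tmp
  end
end
```

A caller would read the yielded file inside the block, for example capture_output('echo hello') { |tmp| tmp.read }.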

.get(url, headers: {}, **options) {|tmp| ... } ⇒ Object?

The get method retrieves content from a URL, consulting the cache first when a :cache option is given. On a cache hit, it returns the result obtained from the cached content without making a network request. On a miss, it creates a new fetcher from the remaining options, fetches the URL with the given headers, and yields a temporary file containing the content to the block. After the block has run, the content is stored in the cache, unless the yielded object is a StringIO.

Parameters:

  • url (String)

    the URL to fetch content from

  • headers (Hash) (defaults to: {})

    optional headers to include in the request

  • options (Hash)

    additional options for the fetch operation

Yields:

  • (tmp)

Returns:

  • (Object)

    the result of the block execution

  • (nil)

    if no block is given or if the fetch fails



# File 'lib/ollama_chat/utils/fetcher.rb', line 90

def self.get(url, headers: {}, **options, &block)
  cache = options.delete(:cache) and
    cache = OllamaChat::Utils::CacheFetcher.new(cache)
  cache and infobar.puts "Getting #{url.to_s.inspect} via cache…"
  if result = cache&.get(url, &block)
    content_type = result&.content_type || 'unknown'
    infobar.puts "…hit, found #{content_type} content in cache."
    return result
  else
    new(**options).send(:get, url, headers:) do |tmp|
      result = block.(tmp)
      if cache && !tmp.is_a?(StringIO)
        tmp.rewind
        cache.put(url, tmp)
      end
      result
    end
  end
end
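
The cache-first flow above can be sketched without any network or cache backend. Everything named here is an assumption made for illustration: MemoryCache is a hypothetical in-memory stand-in for a cache wrapper with get and put, and fetch is any callable that returns the body as a String.

```ruby
require 'stringio'

# Hypothetical in-memory cache exposing the get/put shape used above.
class MemoryCache
  def initialize
    @store = {}
  end

  # Yields a StringIO over the cached body and returns the block's result,
  # or returns nil on a miss.
  def get(url)
    body = @store[url] or return nil
    yield StringIO.new(body)
  end

  def put(url, io)
    @store[url] = io.read
  end
end

# Try the cache first; on a miss, fetch, yield, then store the content.
def cached_get(url, cache:, fetch:, &block)
  if (result = cache.get(url, &block))
    return result
  end
  io = StringIO.new(fetch.call(url))
  result = block.call(io)
  io.rewind # the block may have consumed the IO, so rewind before caching
  cache.put(url, io)
  result
end
```

In this sketch a block that returns nil or false on a hit is indistinguishable from a miss, so the content would be fetched again.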

.normalize_url(url) ⇒ Object

The normalize_url method converts the URL to a string, decodes its percent-encoded components, strips the fragment (everything from the first # onward), and then re-escapes the result so that it is a properly formatted URL.



# File 'lib/ollama_chat/utils/fetcher.rb', line 113

def self.normalize_url(url)
  url = url.to_s
  url = URI.decode_uri_component(url)
  url = url.sub(/#.*/, '')
  URI::Parser.new.escape(url).to_s
end
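
The decode, strip, and re-escape steps can be tried in isolation. The sketch below is an approximation under stated assumptions: the function name is hypothetical, and it uses URI::RFC2396_Parser for both unescaping and escaping instead of the URI.decode_uri_component and URI::Parser calls shown in the listing, so edge cases may differ.

```ruby
require 'uri'

# Hypothetical stand-in for the normalization steps described above.
def normalize_url_sketch(url)
  parser = URI::RFC2396_Parser.new
  url = parser.unescape(url.to_s) # decode %XX sequences
  url = url.sub(/#.*/, '')        # drop the fragment, if any
  parser.escape(url)              # re-escape unsafe characters such as spaces
end
```

With this sketch, 'https://example.com/a%20b#frag' and 'https://example.com/a b' normalize to the same string, which is useful when the URL serves as a cache key.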

.read(filename) {|file| ... } ⇒ nil, Object

The read method opens a file, extends it with the HeaderExtension module, and sets its content type from the filename. It then yields the file to the provided block for processing. If the file does not exist, it prints an error message to standard error.

Parameters:

  • filename (String)

    the path to the file to be read

Yields:

  • (file)

    yields the opened file with header extension

Returns:

  • (nil)

    returns nil if the file does not exist

  • (Object)

    returns the result of the block execution if the file exists



# File 'lib/ollama_chat/utils/fetcher.rb', line 131

def self.read(filename, &block)
  if File.exist?(filename)
    File.open(filename) do |file|
      file.extend(OllamaChat::Utils::Fetcher::HeaderExtension)
      file.content_type = MIME::Types.type_for(filename).first
      block.(file)
    end
  else
    STDERR.puts "File #{filename.to_s.inspect} doesn't exist."
  end
end
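
Extending a single open file with extra metadata, as read does, can be illustrated with plain Ruby. This is only a sketch: ContentTyped is a hypothetical stand-in for HeaderExtension, and the small EXTENSION_TYPES table replaces the MIME::Types lookup used in the listing.

```ruby
# Hypothetical stand-in for HeaderExtension: adds a content_type accessor.
module ContentTyped
  attr_accessor :content_type
end

# Tiny extension-to-type table (an assumption made for this sketch).
EXTENSION_TYPES = { '.json' => 'application/json', '.txt' => 'text/plain' }.freeze

def read_typed(filename)
  unless File.exist?(filename)
    warn "File #{filename.to_s.inspect} doesn't exist."
    return nil
  end
  File.open(filename) do |file|
    # extend affects only this File object, not the File class.
    file.extend(ContentTyped)
    file.content_type = EXTENSION_TYPES[File.extname(filename)]
    yield file
  end
end
```

Because extend modifies one object only, other File instances in the program are unaffected.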

Instance Method Details

#callback(tmp) ⇒ Proc (private)

The callback method creates a proc that handles chunked data processing by updating progress information and writing chunks to a temporary file.

Parameters:

  • tmp (Tempfile)

    the temporary file to which data chunks are written

Returns:

  • (Proc)

    a proc that accepts chunk, remaining_bytes, and total_bytes parameters for processing streamed data



# File 'lib/ollama_chat/utils/fetcher.rb', line 310

def callback(tmp)
  -> chunk, remaining_bytes, total_bytes do
    total   = total_bytes or next
    current = total_bytes - remaining_bytes
    if @started
      infobar.counter.progress(by: total - current)
    else
      @started = true
      infobar.counter.reset(total:, current:)
    end
    infobar.update(message: message(current, total), force: true)
    tmp.print(chunk)
  end
end
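
The shape of such a chunk callback can be sketched without Excon or a progress bar. The names chunk_writer and progress are hypothetical; the sketch only records the number of bytes received so far and writes each chunk, and like the listing it skips chunks for which no total size is known.

```ruby
require 'stringio'

# Hypothetical helper: builds a lambda with the (chunk, remaining, total)
# signature and appends each chunk to io while recording bytes received.
def chunk_writer(io, progress)
  ->(chunk, remaining_bytes, total_bytes) do
    next unless total_bytes # inside a lambda, next returns from this call only
    progress << (total_bytes - remaining_bytes)
    io.print(chunk)
  end
end
```

The returned lambda can be called directly with chunk data, which makes the progress arithmetic easy to check without a real HTTP transfer.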

#decorate_io(tmp, response) ⇒ Object (private)

Decorates a temporary IO object with header information from an HTTP response.

This method extends the given temporary IO object with HeaderExtension module and populates it with content type and cache expiration information extracted from the provided response headers.

Parameters:

  • tmp (IO)

    The temporary IO object to decorate (typically a file handle)

  • response (Object)

    HTTP response object containing headers

Options Hash (response):

  • :headers (Hash)

    HTTP headers hash



# File 'lib/ollama_chat/utils/fetcher.rb', line 289

def decorate_io(tmp, response)
  tmp.rewind
  tmp.extend(HeaderExtension)
  if content_type = MIME::Types[response.headers['content-type']].first
    tmp.content_type = content_type
  end
  if cache_control = response.headers['cache-control'] and
      cache_control !~ /no-store|no-cache/ and
      ex = cache_control[/s-maxage\s*=\s*(\d+)/, 1] || cache_control[/max-age\s*=\s*(\d+)/, 1]
  then
    tmp.ex = ex.to_i
  end
end
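
The Cache-Control handling in decorate_io can be isolated into a small pure function for experimentation. The function name cache_ttl is hypothetical; the regular expressions are the ones shown in the listing.

```ruby
# Hypothetical helper: derive a cache lifetime in seconds from a
# Cache-Control header value, or nil if the response must not be cached.
def cache_ttl(cache_control)
  return nil if cache_control.nil?
  return nil if cache_control.match?(/no-store|no-cache/)
  # s-maxage takes priority over max-age when both are present.
  ttl = cache_control[/s-maxage\s*=\s*(\d+)/, 1] ||
        cache_control[/max-age\s*=\s*(\d+)/, 1]
  ttl&.to_i
end
```

For example, a header value of 'max-age=60, s-maxage=120' yields 120, while 'no-store' and a missing header both yield nil.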

#excon(url, **options) ⇒ Excon (private)

The excon method creates a new Excon client instance configured with the specified URL and options.

Parameters:

  • url (String)

    the URL to be used for the Excon client

  • options (Hash)

    additional options for the client; they are merged with http_options, whose values take precedence on conflicting keys

Returns:

  • (Excon)

    a new Excon client instance

See Also:

  • .normalize_url
  • #http_options


# File 'lib/ollama_chat/utils/fetcher.rb', line 200

def excon(url, **options)
  url = self.class.normalize_url(url)
  Excon.new(url, options.merge(@http_options))
end

#get(url, headers: {}) {|Tempfile| ... } ⇒ Object (private)

Makes an HTTP GET request to the specified URL with optional headers and processing block.

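This method handles both streaming and non-streaming HTTP requests, using Excon for the actual HTTP communication. It first attempts a streaming request; if the response status is not 200 or the streaming callback never started, it retries without streaming, this time with redirect following enabled. The response body is written to a temporary file, which is then decorated with content type and cache expiration information before being passed to the provided block. If the request fails, the error is reported on standard error and HeaderExtension.failed is yielded instead.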

Parameters:

  • url (String)

    The URL to make the GET request to

  • headers (Hash) (defaults to: {})

    Optional headers to include in the request (keys will be converted to strings)

Yields:

  • (Tempfile)

    The temporary file containing the response body, after decoration



# File 'lib/ollama_chat/utils/fetcher.rb', line 218

def get(url, headers: {}, &block)
  headers |= self.headers
  headers = headers.transform_keys(&:to_s)
  response = nil
  Tempfile.open do |tmp|
    infobar.label = 'Getting'
    if @streaming
      response = excon(url, headers:, response_block: callback(tmp)).request(method: :get)
      response.status != 200 || !@started and raise RetryWithoutStreaming
      decorate_io(tmp, response)
      infobar.finish
      block.(tmp)
    else
      response = excon(url, headers:, middlewares:).request(method: :get)
      if response.status != 200
        raise "invalid response status code"
      end
      body = response.body
      tmp.print body
      infobar.update(message: message(body.size, body.size), force: true)
      decorate_io(tmp, response)
      infobar.finish
      block.(tmp)
    end
  end
rescue RetryWithoutStreaming
  @streaming = false
  retry
rescue => e
  STDERR.puts "Cannot get #{url.to_s.inspect} (#{e}): #{response&.status_line || 'n/a'}"
  if @debug && !e.is_a?(RuntimeError)
    STDERR.puts "#{e.backtrace * ?\n}"
  end
  yield HeaderExtension.failed
end
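
The fallback from streaming to a buffered request relies on Ruby's retry keyword inside a method-level rescue. The sketch below shows only that control flow under stated assumptions: the Downloader class and its transport callable are hypothetical, and a nil body stands in for a failed request.

```ruby
# Signals that the streaming attempt failed and a buffered one should follow.
class RetryWithoutStreaming < StandardError; end

# Hypothetical class; transport is any callable taking (url, streaming:).
class Downloader
  attr_reader :attempts

  def initialize(transport)
    @transport = transport
    @streaming = true
    @attempts  = []
  end

  def get(url)
    @attempts << (@streaming ? :streaming : :buffered)
    body = @transport.call(url, streaming: @streaming)
    if body.nil?
      # Only the streaming attempt may trigger a retry; a buffered
      # failure raises a plain error so the method cannot loop forever.
      raise RetryWithoutStreaming if @streaming
      raise 'invalid response'
    end
    body
  rescue RetryWithoutStreaming
    @streaming = false
    retry # re-runs the method body with streaming disabled
  end
end
```

Flipping the flag before retry is what bounds the number of attempts to two.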

#headers ⇒ Hash (private)

Note:

The returned hash includes the ‘User-Agent’ header set to OllamaChat::Chat.user_agent.

The headers method returns a hash containing the default HTTP headers that should be used for requests, including a User-Agent header configured with the application’s user agent string.

Returns:

  • (Hash)

    a hash mapping header names to their values



# File 'lib/ollama_chat/utils/fetcher.rb', line 261

def headers
  {
    'User-Agent' => OllamaChat::Chat.user_agent,
  }
end

#message(current, total) ⇒ String (private)

The message method formats progress information by combining current and total values with unit formatting, along with timing details.

Parameters:

  • current (Integer)

    the current progress value

  • total (Integer)

    the total progress value

Returns:

  • (String)

    a formatted progress string including units and timing information



# File 'lib/ollama_chat/utils/fetcher.rb', line 332

def message(current, total)
  progress = '%s/%s' % [ current, total ].map {
    Tins::Unit.format(_1, format: '%.2f %U')
  }
  '%l ' + progress + ' in %te, ETA %e @%E'
end

#middlewares ⇒ Array (private)

The middlewares method returns the combined array of default Excon middlewares and the RedirectFollower middleware, ensuring there are no duplicates.

Returns:

  • (Array)

    an array of middleware classes: the default Excon middlewares plus RedirectFollower, with duplicates removed



# File 'lib/ollama_chat/utils/fetcher.rb', line 273

def middlewares
  (Excon.defaults[:middlewares] + [ Excon::Middleware::RedirectFollower ]).uniq
end