Module: OllamaChat::WebSearching

Included in:
Chat
Defined in:
lib/ollama_chat/web_searching.rb

Overview

A module that provides web search functionality for OllamaChat.

The WebSearching module encapsulates the logic for performing web searches using configured search engines. It handles query construction and information integration, and delegates to engine-specific implementations for retrieving search results. The module supports multiple search engines, including SearxNG and DuckDuckGo, which makes it flexible for different deployment scenarios and privacy preferences.

Examples:

Performing a web search

chat.search_web('ruby programming tutorials', 5)

Instance Method Summary

Instance Method Details

#manage_links(command)

The manage_links method handles operations on a collection of links, such as displaying them or clearing specific entries.

It supports two main commands: 'clear' and nil (default). When the command is 'clear', it presents an interactive menu to either clear all links or individual links. When the command is nil, it displays the current list of links with hyperlinks.

Parameters:

  • command (String, nil)

    the operation to perform on the links



# File 'lib/ollama_chat/web_searching.rb', line 105

def manage_links(command)
  case command
  when 'clear'
    choose_with_state do
      loop do
        links_options = links.to_a.unshift('[ALL]').unshift('[EXIT]')
        link = choose_entry(links_options, prompt: 'Clear? %s')
        case link
        when nil, '[EXIT]'
          STDOUT.puts "Exiting chooser."
          break
        when '[ALL]'
          if confirm?(prompt: '🔔 Are you sure? (y/n) ', yes: /\Ay/i)
            links.clear
            STDOUT.puts "Cleared all links in list."
            break
          else
            STDOUT.puts 'Cancelled.'
            confirm?(prompt: "\nāŽ  Press any key to continue (%s). ", timeout: 3)
          end
        when /./
          links.delete(link)
        end
      end
    end
  when nil
    if links.empty?
      STDOUT.puts "List is empty."
    else
      w       = Math.log10(links.size + 1).ceil
      format  = "%#{w}s. %s"
      connect = -> link { hyperlink(link) { link } }
      STDOUT.puts links.each_with_index.map { |x, i| format % [ i + 1, connect.(x) ] }
    end
  end
end
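
The numbering in the nil branch can be tried in isolation. The following is a minimal sketch, assuming a plain array of strings in place of links and leaving out the hyperlink call; numbered_lines is a hypothetical helper name, not part of OllamaChat.

```ruby
# Hypothetical helper (not part of OllamaChat) that mirrors the numbering
# logic of manage_links when command is nil.
def numbered_lines(items)
  # Number of digits needed for the largest index, e.g. 1 for 9 items
  # and 2 for 10 items.
  w      = Math.log10(items.size + 1).ceil
  format = "%#{w}s. %s"
  # Right-align each 1-based index to the computed width.
  items.each_with_index.map { |x, i| format % [i + 1, x] }
end

puts numbered_lines(('a'..'j').to_a)
```

With ten entries the width is two, so single-digit indices are padded with a leading space and the dots line up.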

#search_engine ⇒ String (private)

The search_engine method returns the currently configured web search engine to be used for online searches.

Returns:

  • (String)

    the name of the web search engine



# File 'lib/ollama_chat/web_searching.rb', line 148

def search_engine
  config.web_search.use
end

#search_web(query, n = nil) ⇒ Array<String>?

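The search_web method performs a web search using the configured search engine. It converts n with to_i and raises it to at least 1, URI-encodes the query, and dispatches to the private method search_web_with_<engine> that matches the configured engine. Every returned URL is also added to the links collection. If no matching method exists, it prints a notice and returns nil.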

Parameters:

  • query (String)

    the search query string

  • n (Integer, nil) (defaults to: nil)

    the maximum number of results to return; nil and values below 1 are treated as 1

Returns:

  • (Array<String>, nil)

    an array of URLs from the search results or nil if the search engine is not implemented



# File 'lib/ollama_chat/web_searching.rb', line 24

def search_web(query, n = nil)
  n     = n.to_i.clamp(1..)
  query = URI.encode_uri_component(query)
  search_command = :"search_web_with_#{search_engine}"
  if respond_to?(search_command, true)
    send(search_command, query, n).tap do |results|
      results.each { |url| links.add(url) }
    end
  else
    STDOUT.puts "Search engine #{bold{search_engine}} not implemented!"
    nil
  end
end
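
The dispatch pattern used by search_web can be sketched without any network access. This is a minimal illustration, assuming a Set for links and a made-up engine named fake; DemoSearcher and search_web_with_fake are hypothetical and do not exist in OllamaChat, and query encoding is left out for brevity.

```ruby
require 'set'

# Hypothetical stand-in for the chat object. Only the names search_web,
# search_engine and links are taken from the documented module.
class DemoSearcher
  attr_reader :links

  def initialize(engine)
    @engine = engine
    @links  = Set.new # assumption: links behaves like a Set
  end

  def search_engine
    @engine
  end

  def search_web(query, n = nil)
    n = n.to_i.clamp(1..) # nil and 0 both become 1
    search_command = :"search_web_with_#{search_engine}"
    if respond_to?(search_command, true) # true also finds private methods
      send(search_command, query, n).tap do |results|
        results.each { |url| links.add(url) }
      end
    end
  end

  private

  # Made-up engine that returns canned URLs instead of doing network I/O.
  def search_web_with_fake(_query, n)
    %w[https://example.com/1 https://example.com/2 https://example.com/3].first(n)
  end
end

searcher = DemoSearcher.new('fake')
p searcher.search_web('ruby programming tutorials', 2)
```

Adding another engine in this scheme only requires defining one more private search_web_with_... method; the dispatcher itself stays unchanged.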

#search_web_with_duckduckgo(query, n) ⇒ Array<String> (private)

The search_web_with_duckduckgo method performs a web search using the DuckDuckGo search engine and extracts URLs from the search results.

Parameters:

  • query (String)

    the search query string to be used

  • n (Integer)

    the maximum number of URLs to extract from the search results

Returns:

  • (Array<String>)

    an array of URL strings extracted from the search results



# File 'lib/ollama_chat/web_searching.rb', line 176

def search_web_with_duckduckgo(query, n)
  url = config.web_search.engines.duckduckgo.url % { query: }
  get_url(url, cache:) do |tmp|
    result = []
    doc = Nokogiri::HTML(tmp)
    doc.css('.results_links').each do |link|
      if n > 0
        # Skip result blocks without a usable anchor, since sub! would fail on nil.
        url = link.css('.result__a').first&.[]('href') or next
        url.sub!(%r(\A(//duckduckgo\.com)?/l/\?uddg=), '')
        url.sub!(%r(&rut=.*), '')
        url = URI.decode_uri_component(url)
        url = URI.parse(url)
        url.host =~ /duckduckgo\.com/ and next
        url = url.to_s
        result << url
        n -= 1
      else
        break
      end
    end
    result
  end
end
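
The URL clean-up inside the loop can be looked at on its own. Below is a minimal sketch, assuming a redirect link of the shape the regular expressions above expect; unwrap_duckduckgo_href is a hypothetical helper, and it uses URI.decode_www_form_component in place of URI.decode_uri_component so that it also runs on older Ruby versions (the two differ in how they treat "+").

```ruby
require 'uri'

# Hypothetical helper (not part of OllamaChat): turns a DuckDuckGo redirect
# href into the target URL, or returns nil when there is nothing usable.
def unwrap_duckduckgo_href(href)
  return nil if href.nil?
  # Strip the redirect prefix and the trailing rut token.
  target = href.sub(%r(\A(//duckduckgo\.com)?/l/\?uddg=), '').sub(%r(&rut=.*), '')
  # Assumption: www-form decoding is close enough for this illustration.
  target = URI.decode_www_form_component(target)
  uri = URI.parse(target)
  # Links that still point at DuckDuckGo itself are dropped.
  return nil if uri.host =~ /duckduckgo\.com/
  uri.to_s
end

puts unwrap_duckduckgo_href(
  '//duckduckgo.com/l/?uddg=https%3A%2F%2Fwww.ruby-lang.org%2Fen%2F&rut=abc123'
)
```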

#search_web_with_searxng(query, n) ⇒ Array<String> (private)

The search_web_with_searxng method performs a web search using the SearxNG engine and returns the URLs of the first n search results.

Parameters:

  • query (String)

    the search query string

  • n (Integer)

    the number of search results to return

Returns:

  • (Array<String>)

    an array of URLs from the search results



# File 'lib/ollama_chat/web_searching.rb', line 159

def search_web_with_searxng(query, n)
  url = config.web_search.engines.searxng.url % { query: }
  get_url(url, cache:) do |tmp|
    data = JSON.parse(tmp.read, object_class: JSON::GenericObject)
    data.results.first(n).map(&:url)
  end
end
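
The result extraction can be illustrated with a canned response. This is a minimal sketch, assuming a JSON body with a top-level results array whose entries carry a url field, as the method above implies; first_result_urls and SAMPLE_BODY are hypothetical, and plain hashes are used here instead of JSON::GenericObject.

```ruby
require 'json'

# Hypothetical helper (not part of OllamaChat): returns the URLs of the
# first n entries in a SearxNG-style JSON response body.
def first_result_urls(body, n)
  data = JSON.parse(body)
  # A missing "results" key is treated as an empty result list.
  data.fetch('results', []).first(n).map { |result| result['url'] }
end

# Made-up sample data for illustration only.
SAMPLE_BODY = <<~JSON
  {"results": [
    {"url": "https://a.example/"},
    {"url": "https://b.example/"},
    {"url": "https://c.example/"}
  ]}
JSON

p first_result_urls(SAMPLE_BODY, 2)
```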

#web(count, query) ⇒ String, Symbol

Performs a web search and processes the results based on document processing configuration.

Searches for the given query using the configured search engine and processes up to the specified number of URLs. The processing approach varies based on the current document policy and embedding status:

  • Embedding mode: When document_policy.selected == 'embedding' AND @embedding.on? is true, each result is embedded and the query is interpolated into the web_embed prompt.
  • Summarizing mode: When document_policy.selected == 'summarizing', each result is summarized and both query and results are interpolated into the web_summarize prompt.
  • Importing mode: For all other cases, each result is imported and both query and results are interpolated into the web_import prompt.

Examples:

Basic web search

web('3', 'ruby programming tutorials')

Web search with embedding policy

# With document_policy.selected == 'embedding' and @embedding.on?
# Processes results through embedding pipeline

Web search with summarizing policy

# With document_policy.selected == 'summarizing'
# Processes results through summarization pipeline

Parameters:

  • count (String)

    The maximum number of search results to process; converted with to_i, and values below 1 are treated as 1

  • query (String)

    The search query string

Returns:

  • (String, Symbol)

    The interpolated prompt content when successful, or :next if search_web returned nil (for example, because the configured search engine is not implemented)



# File 'lib/ollama_chat/web_searching.rb', line 68

def web(count, query)
  urls = search_web(query, count.to_i) or return :next
  if document_policy.selected == 'embedding' && @embedding.on?
    prompt = prompt(:web_embed).to_s
    urls.each do |url|
      fetch_source(url) { |url_io| embed_source(url_io, url) }
    end
    prompt.named_placeholders_interpolate({query:})
  elsif document_policy.selected == 'summarizing'
    prompt = prompt(:web_summarize).to_s
    results = urls.each_with_object('') do |url, content|
      summarize(url).full? do |c|
        content << c.ask_and_send_or_self(:read)
      end
    end
    prompt.named_placeholders_interpolate({query:, results:})
  else
    prompt = prompt(:web_import).to_s
    results = urls.each_with_object('') do |url, content|
      import(url).full? do |c|
        content << c.ask_and_send_or_self(:read)
      end
    end
    prompt.named_placeholders_interpolate({query:, results:})
  end
end
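
The mode selection described for this method can be summarized as a small decision function. This is a minimal sketch, assuming the policy is passed as a plain string and the embedding state as a boolean; web_prompt_name is a hypothetical helper and not part of OllamaChat.

```ruby
# Hypothetical helper (not part of OllamaChat): returns the name of the
# prompt that corresponds to the documented processing mode.
def web_prompt_name(policy, embedding_on)
  if policy == 'embedding' && embedding_on
    :web_embed      # embedding mode: results are embedded
  elsif policy == 'summarizing'
    :web_summarize  # summarizing mode: results are summarized
  else
    :web_import     # importing mode: everything else
  end
end

p web_prompt_name('embedding', true)
p web_prompt_name('embedding', false)
```

Note that an 'embedding' policy with embeddings switched off falls through to the importing mode, because the first condition requires both parts to hold.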