
Commit eb51dfd

Merge pull request #436 from ThomasSevestre/optional_access_token
Allow usage of offline models with Ollama
2 parents 54043a7 + d74cc9d commit eb51dfd

9 files changed, +145 -33 lines

Gemfile

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ gemspec
 gem "byebug", "~> 11.1.3"
 gem "dotenv", "~> 2.8.1"
 gem "rake", "~> 13.1"
-gem "rspec", "~> 3.12"
+gem "rspec", "~> 3.13"
 gem "rubocop", "~> 1.50.2"
 gem "vcr", "~> 6.1.0"
 gem "webmock", "~> 3.19.1"

Gemfile.lock

Lines changed: 13 additions & 13 deletions
@@ -16,7 +16,7 @@ GEM
     byebug (11.1.3)
     crack (0.4.5)
       rexml
-    diff-lcs (1.5.0)
+    diff-lcs (1.5.1)
     dotenv (2.8.1)
     event_stream_parser (1.0.0)
     faraday (2.8.1)
@@ -37,19 +37,19 @@ GEM
     rake (13.1.0)
     regexp_parser (2.8.0)
     rexml (3.2.6)
-    rspec (3.12.0)
-      rspec-core (~> 3.12.0)
-      rspec-expectations (~> 3.12.0)
-      rspec-mocks (~> 3.12.0)
-    rspec-core (3.12.0)
-      rspec-support (~> 3.12.0)
-    rspec-expectations (3.12.2)
+    rspec (3.13.0)
+      rspec-core (~> 3.13.0)
+      rspec-expectations (~> 3.13.0)
+      rspec-mocks (~> 3.13.0)
+    rspec-core (3.13.0)
+      rspec-support (~> 3.13.0)
+    rspec-expectations (3.13.0)
       diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.12.0)
-    rspec-mocks (3.12.3)
+      rspec-support (~> 3.13.0)
+    rspec-mocks (3.13.0)
       diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.12.0)
-    rspec-support (3.12.0)
+      rspec-support (~> 3.13.0)
+    rspec-support (3.13.1)
     rubocop (1.50.2)
       json (~> 2.3)
       parallel (~> 1.10)
@@ -78,7 +78,7 @@ DEPENDENCIES
  byebug (~> 11.1.3)
  dotenv (~> 2.8.1)
  rake (~> 13.1)
- rspec (~> 3.12)
+ rspec (~> 3.13)
  rubocop (~> 1.50.2)
  ruby-openai!
  vcr (~> 6.1.0)

README.md

Lines changed: 33 additions & 0 deletions
@@ -24,6 +24,7 @@ Stream text with GPT-4, transcribe and translate audio with Whisper, or create i
 - [Extra Headers per Client](#extra-headers-per-client)
 - [Verbose Logging](#verbose-logging)
 - [Azure](#azure)
+- [Ollama](#ollama)
 - [Counting Tokens](#counting-tokens)
 - [Models](#models)
 - [Examples](#examples)
@@ -191,6 +192,38 @@ To use the [Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/cognit

 where `AZURE_OPENAI_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo`

+#### Ollama
+
+Ollama allows you to run open-source LLMs, such as Llama 3, locally. It [offers chat compatibility](https://github.com/ollama/ollama/blob/main/docs/openai.md) with the OpenAI API.
+
+You can download Ollama [here](https://ollama.com/). On macOS you can install and run Ollama like this:
+
+```bash
+brew install ollama
+ollama serve
+ollama pull llama3:latest # In new terminal tab.
+```
+
+Create a client using your Ollama server and the pulled model, and stream a conversation for free:
+
+```ruby
+client = OpenAI::Client.new(
+  uri_base: "http://localhost:11434"
+)
+
+client.chat(
+  parameters: {
+    model: "llama3", # Required.
+    messages: [{ role: "user", content: "Hello!"}], # Required.
+    temperature: 0.7,
+    stream: proc do |chunk, _bytesize|
+      print chunk.dig("choices", 0, "delta", "content")
+    end
+  })
+
+# => Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
+```
+
 ### Counting Tokens

 OpenAI parses prompt text into [tokens](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them), which are words or portions of words. (These tokens are unrelated to your API access_token.) Counting tokens can help you estimate your [costs](https://openai.com/pricing). It can also help you ensure your prompt text size is within the max-token limits of your model's context window, and choose an appropriate [`max_tokens`](https://platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens) completion parameter so your response will fit as well.
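A note on the Counting Tokens context above: the gem does not count tokens itself, but a prompt's token usage can be estimated client-side before sending it. A minimal sketch, assuming the third-party `tiktoken_ruby` gem is installed (it is not part of this commit; the model name and prompt are illustrative):

```ruby
require "tiktoken_ruby"

# Rough client-side token count, useful for estimating cost and for picking
# a max_tokens value that still fits in the model's context window.
encoder = Tiktoken.encoding_for_model("gpt-4")
prompt  = "Stream text with GPT-4, transcribe and translate audio with Whisper."

puts "Prompt uses #{encoder.encode(prompt).length} tokens"
```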

lib/openai.rb

Lines changed: 2 additions & 10 deletions
@@ -36,9 +36,8 @@ def call(env)
   end

   class Configuration
-    attr_writer :access_token
-    attr_accessor :api_type, :api_version, :organization_id, :uri_base, :request_timeout,
-                  :extra_headers
+    attr_accessor :access_token, :api_type, :api_version, :organization_id,
+                  :uri_base, :request_timeout, :extra_headers

     DEFAULT_API_VERSION = "v1".freeze
     DEFAULT_URI_BASE = "https://api.openai.com/".freeze
@@ -53,13 +52,6 @@ def initialize
       @request_timeout = DEFAULT_REQUEST_TIMEOUT
       @extra_headers = {}
     end
-
-    def access_token
-      return @access_token if @access_token
-
-      error_text = "OpenAI access token missing! See https://github.com/alexrudall/ruby-openai#usage"
-      raise ConfigurationError, error_text
-    end
   end

   class << self
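The practical effect of this change: `access_token` becomes a plain read/write attribute, so instantiating a client without a token no longer raises `OpenAI::ConfigurationError`. A minimal sketch of the setup this unlocks, mirroring the README example added in this PR (local Ollama on its default port; the model name is illustrative):

```ruby
require "openai"

# No access_token is configured; after this commit that is no longer an error.
# Ollama ignores the Authorization header, so a bare client works against it.
client = OpenAI::Client.new(uri_base: "http://localhost:11434")

response = client.chat(
  parameters: {
    model: "llama3",
    messages: [{ role: "user", content: "Hello!" }]
  }
)
puts response.dig("choices", 0, "message", "content")
```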

spec/fixtures/cassettes/llama3_chat.yml

Lines changed: 67 additions & 0 deletions
Some generated files are not rendered by default.

spec/openai/client/chat_spec.rb

Lines changed: 19 additions & 1 deletion
@@ -3,8 +3,9 @@
     context "with messages", :vcr do
       let(:messages) { [{ role: "user", content: "Hello!" }] }
       let(:stream) { false }
+      let(:uri_base) { nil }
       let(:response) do
-        OpenAI::Client.new.chat(
+        OpenAI::Client.new({ uri_base: uri_base }).chat(
           parameters: parameters
         )
       end
@@ -172,6 +173,23 @@ def call(chunk)
           end
         end
       end
+
+      context "with Ollama + model: llama3" do
+        let(:uri_base) { "http://localhost:11434" }
+        let(:model) { "llama3" }
+
+        it "succeeds" do
+          VCR.use_cassette(cassette) do
+            vcr_skip do
+              Faraday.new(url: uri_base).get
+            rescue Faraday::ConnectionFailed
+              pending "This test needs `ollama serve` running locally with #{model} installed"
+            end
+
+            expect(content.split.empty?).to eq(false)
+          end
+        end
+      end
     end
   end
 end

spec/openai_spec.rb

Lines changed: 0 additions & 8 deletions
@@ -29,14 +29,6 @@
       expect(OpenAI.configuration.extra_headers).to eq(extra_headers)
     end

-    context "without an access token" do
-      let(:access_token) { nil }
-
-      it "raises an error" do
-        expect { OpenAI::Client.new.chat }.to raise_error(OpenAI::ConfigurationError)
-      end
-    end
-
     context "with custom timeout and uri base" do
       before do
         OpenAI.configure do |config|

spec/spec_helper.rb

Lines changed: 2 additions & 0 deletions
@@ -49,6 +49,8 @@
       end
     end
   end
+
+  c.include VCRHelpers
 end

 RSPEC_ROOT = File.dirname __FILE__

spec/support/vcr_skip.rb

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+module VCRHelpers
+  def vcr_skip
+    VCR.configure { |c| c.allow_http_connections_when_no_cassette = true }
+    yield
+  ensure
+    VCR.configure { |c| c.allow_http_connections_when_no_cassette = false }
+  end
+end
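For context, `vcr_skip` temporarily lifts VCR's ban on HTTP requests made without a cassette, runs the block, then restores the ban. A sketch of how a spec might use it to probe a local server, assuming `VCRHelpers` is included via the `spec_helper` change above (the URL and status check are illustrative):

```ruby
RSpec.describe "local Ollama server" do
  it "is reachable when `ollama serve` is running" do
    vcr_skip do
      # Real HTTP is allowed inside this block even with no cassette inserted.
      response = Faraday.new(url: "http://localhost:11434").get
      expect(response.status).to eq(200)
    end
    # After the block, VCR again rejects un-recorded HTTP requests.
  end
end
```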
