Arrcttacsrks committed on
Commit 09fb5c6 · verified · 1 Parent(s): 16c390a

Upload llama.cpp/Makefile with huggingface_hub
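
The Makefile added in this commit contains a "Deprecation aliases" block that forwards old `LLAMA_*` build variables to their `GGML_*` replacements. The Python sketch below is only a reading aid that models that mapping; it is not part of the commit. The flag names are taken from the Makefile, while the function name `translate_legacy_flags` and the dictionary representation of make variables are assumptions made for this illustration. It also ignores make's precedence rules for command-line variables.

```python
# Illustrative model of the Makefile's deprecation-alias handling.
# Flag names come from the Makefile; the function and data layout are
# hypothetical and exist only for this sketch.

# Legacy variables that the Makefile forwards to a GGML_* variable
# (each one also sets DEPRECATE_WARNING := 1).
RENAMED = {
    "LLAMA_CUDA": "GGML_CUDA",
    "LLAMA_KOMPUTE": "GGML_KOMPUTE",
    "LLAMA_METAL": "GGML_METAL",
    "LLAMA_RPC": "GGML_RPC",
    "LLAMA_SYCL": "GGML_SYCL",
    "LLAMA_SYCL_F16": "GGML_SYCL_F16",
    "LLAMA_OPENBLAS": "GGML_OPENBLAS",
    "LLAMA_OPENBLAS64": "GGML_OPENBLAS64",
    "LLAMA_BLIS": "GGML_BLIS",
    "LLAMA_NO_LLAMAFILE": "GGML_NO_LLAMAFILE",
    "LLAMA_NO_ACCELERATE": "GGML_NO_ACCELERATE",
    "LLAMA_NO_OPENMP": "GGML_NO_OPENMP",
    "LLAMA_NO_METAL": "GGML_NO_METAL",
    "LLAMA_NO_CCACHE": "GGML_NO_CCACHE",
}

# Variables the Makefile rejects outright via $(error ...).
HARD_ERRORS = {
    "LLAMA_CUBLAS": "LLAMA_CUBLAS is removed. Use GGML_CUDA instead.",
}

# Variables that only set REMOVE_WARNING := 1.
REMOVED = {"LLAMA_DISABLE_LOGS", "LLAMA_SERVER_VERBOSE"}


def translate_legacy_flags(variables):
    """Return (new_variables, deprecate_warning, remove_warning).

    Like make's `ifdef`, a variable counts as defined only when its
    value is non-empty.
    """
    def defined(name):
        return bool(variables.get(name))

    for name, message in HARD_ERRORS.items():
        if defined(name):
            raise ValueError(message)

    result = dict(variables)
    deprecate_warning = False
    for old, new in RENAMED.items():
        if defined(old):
            result[new] = "1"
            deprecate_warning = True

    remove_warning = any(defined(name) for name in REMOVED)
    return result, deprecate_warning, remove_warning


if __name__ == "__main__":
    # Models an invocation such as `make LLAMA_CUDA=1 LLAMA_DEBUG=1`.
    print(translate_legacy_flags({"LLAMA_CUDA": "1", "LLAMA_DEBUG": "1"}))
```

In practice this means an invocation that still passes `LLAMA_CUDA=1` is treated as a `GGML_CUDA` build with a deprecation flag set, whereas `LLAMA_CUBLAS` stops the build immediately.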

Files changed (1)
  1. llama.cpp/Makefile +1697 -0
llama.cpp/Makefile ADDED
@@ -0,0 +1,1697 @@
1
+ # Define the default target now so that it is always the first target
2
+ BUILD_TARGETS = \
3
+ libllava.a \
4
+ llama-batched \
5
+ llama-batched-bench \
6
+ llama-bench \
7
+ llama-cli \
8
+ llama-convert-llama2c-to-ggml \
9
+ llama-embedding \
10
+ llama-eval-callback \
11
+ llama-export-lora \
12
+ llama-gbnf-validator \
13
+ llama-gguf \
14
+ llama-gguf-hash \
15
+ llama-gguf-split \
16
+ llama-gritlm \
17
+ llama-imatrix \
18
+ llama-infill \
19
+ llama-llava-cli \
20
+ llama-minicpmv-cli \
21
+ llama-lookahead \
22
+ llama-lookup \
23
+ llama-lookup-create \
24
+ llama-lookup-merge \
25
+ llama-lookup-stats \
26
+ llama-parallel \
27
+ llama-passkey \
28
+ llama-perplexity \
29
+ llama-q8dot \
30
+ llama-quantize \
31
+ llama-quantize-stats \
32
+ llama-retrieval \
33
+ llama-save-load-state \
34
+ llama-server \
35
+ llama-simple \
36
+ llama-simple-chat \
37
+ llama-speculative \
38
+ llama-tokenize \
39
+ llama-vdot \
40
+ llama-cvector-generator \
41
+ llama-gen-docs \
42
+ tests/test-c.o
43
+
44
+ # Binaries only useful for tests
45
+ TEST_TARGETS = \
46
+ tests/test-arg-parser \
47
+ tests/test-autorelease \
48
+ tests/test-backend-ops \
49
+ tests/test-chat-template \
50
+ tests/test-double-float \
51
+ tests/test-grad0 \
52
+ tests/test-grammar-integration \
53
+ tests/test-grammar-parser \
54
+ tests/test-json-schema-to-grammar \
55
+ tests/test-llama-grammar \
56
+ tests/test-log \
57
+ tests/test-model-load-cancel \
58
+ tests/test-quantize-fns \
59
+ tests/test-quantize-perf \
60
+ tests/test-rope \
61
+ tests/test-sampling \
62
+ tests/test-tokenizer-0 \
63
+ tests/test-tokenizer-1-bpe \
64
+ tests/test-tokenizer-1-spm
65
+ # tests/test-opt
66
+
67
+ # Legacy build targets that were renamed in #7809, but should still be removed when the project is cleaned
68
+ LEGACY_TARGETS_CLEAN = main quantize quantize-stats perplexity imatrix embedding vdot q8dot convert-llama2c-to-ggml \
69
+ simple batched batched-bench save-load-state server gguf gguf-split eval-callback llama-bench libllava.a llava-cli baby-llama \
70
+ retrieval speculative infill tokenize parallel export-lora lookahead lookup passkey gritlm
71
+
72
+ # Legacy build targets that were renamed in #7809; we still build binaries for them that print a deprecation warning if people try to use them.
73
+ # We don't want to clutter things too much, so we only build replacements for the most commonly used binaries.
74
+ LEGACY_TARGETS_BUILD = main quantize perplexity embedding server
75
+
76
+ # Deprecation aliases
77
+ ifdef LLAMA_CUBLAS
78
+ $(error LLAMA_CUBLAS is removed. Use GGML_CUDA instead.)
79
+ endif
80
+
81
+ ifdef LLAMA_CUDA
82
+ GGML_CUDA := 1
83
+ DEPRECATE_WARNING := 1
84
+ endif
85
+
86
+ ifdef LLAMA_KOMPUTE
87
+ GGML_KOMPUTE := 1
88
+ DEPRECATE_WARNING := 1
89
+ endif
90
+
91
+ ifdef LLAMA_METAL
92
+ GGML_METAL := 1
93
+ DEPRECATE_WARNING := 1
94
+ endif
95
+
96
+ ifdef LLAMA_RPC
97
+ GGML_RPC := 1
98
+ DEPRECATE_WARNING := 1
99
+ endif
100
+
101
+ ifdef LLAMA_SYCL
102
+ GGML_SYCL := 1
103
+ DEPRECATE_WARNING := 1
104
+ endif
105
+
106
+ ifdef LLAMA_SYCL_F16
107
+ GGML_SYCL_F16 := 1
108
+ DEPRECATE_WARNING := 1
109
+ endif
110
+
111
+ ifdef LLAMA_OPENBLAS
112
+ GGML_OPENBLAS := 1
113
+ DEPRECATE_WARNING := 1
114
+ endif
115
+
116
+ ifdef LLAMA_OPENBLAS64
117
+ GGML_OPENBLAS64 := 1
118
+ DEPRECATE_WARNING := 1
119
+ endif
120
+
121
+ ifdef LLAMA_BLIS
122
+ GGML_BLIS := 1
123
+ DEPRECATE_WARNING := 1
124
+ endif
125
+
126
+ ifdef LLAMA_NO_LLAMAFILE
127
+ GGML_NO_LLAMAFILE := 1
128
+ DEPRECATE_WARNING := 1
129
+ endif
130
+
131
+ ifdef LLAMA_NO_ACCELERATE
132
+ GGML_NO_ACCELERATE := 1
133
+ DEPRECATE_WARNING := 1
134
+ endif
135
+
136
+ ifdef LLAMA_NO_OPENMP
137
+ GGML_NO_OPENMP := 1
138
+ DEPRECATE_WARNING := 1
139
+ endif
140
+
141
+ ifdef LLAMA_NO_METAL
142
+ GGML_NO_METAL := 1
143
+ DEPRECATE_WARNING := 1
144
+ endif
145
+
146
+ ifdef LLAMA_DISABLE_LOGS
147
+ REMOVE_WARNING := 1
148
+ endif
149
+
150
+ ifdef LLAMA_SERVER_VERBOSE
151
+ REMOVE_WARNING := 1
152
+ endif
153
+
154
+ ifndef UNAME_S
155
+ UNAME_S := $(shell uname -s)
156
+ endif
157
+
158
+ ifndef UNAME_P
159
+ UNAME_P := $(shell uname -p)
160
+ endif
161
+
162
+ ifndef UNAME_M
163
+ UNAME_M := $(shell uname -m)
164
+ endif
165
+
166
+ # In GNU make default CXX is g++ instead of c++. Let's fix that so that users
167
+ # of non-gcc compilers don't have to provide g++ alias or wrapper.
168
+ DEFCC := cc
169
+ DEFCXX := c++
170
+ ifeq ($(origin CC),default)
171
+ CC := $(DEFCC)
172
+ endif
173
+ ifeq ($(origin CXX),default)
174
+ CXX := $(DEFCXX)
175
+ endif
176
+
177
+ # Mac OS + Arm can report x86_64
178
+ # ref: https://github.com/ggerganov/whisper.cpp/issues/66#issuecomment-1282546789
179
+ ifeq ($(UNAME_S),Darwin)
180
+ ifndef GGML_NO_METAL
181
+ GGML_METAL := 1
182
+ endif
183
+
184
+ GGML_NO_OPENMP := 1
185
+
186
+ ifneq ($(UNAME_P),arm)
187
+ SYSCTL_M := $(shell sysctl -n hw.optional.arm64 2>/dev/null)
188
+ ifeq ($(SYSCTL_M),1)
189
+ # UNAME_P := arm
190
+ # UNAME_M := arm64
191
+ warn := $(warning Your arch is announced as x86_64, but it seems to actually be ARM64. Not fixing that can lead to bad performance. For more info see: https://github.com/ggerganov/whisper.cpp/issues/66\#issuecomment-1282546789)
192
+ endif
193
+ endif
194
+ endif
195
+
196
+ ifdef GGML_METAL
197
+ GGML_METAL_EMBED_LIBRARY := 1
198
+ endif
199
+
200
+ ifdef GGML_RPC
201
+ BUILD_TARGETS += rpc-server
202
+ endif
203
+
204
+ ifdef GGML_VULKAN
205
+ BUILD_TARGETS += vulkan-shaders-gen
206
+ endif
207
+
208
+ default: $(BUILD_TARGETS) $(LEGACY_TARGETS_BUILD)
209
+
210
+ test: $(TEST_TARGETS)
211
+ @failures=0; \
212
+ for test_target in $(TEST_TARGETS); do \
213
+ if [ "$$test_target" = "tests/test-tokenizer-0" ]; then \
214
+ ./$$test_target $(CURDIR)/models/ggml-vocab-llama-spm.gguf; \
215
+ ./$$test_target $(CURDIR)/models/ggml-vocab-llama-bpe.gguf; \
216
+ ./$$test_target $(CURDIR)/models/ggml-vocab-phi-3.gguf; \
217
+ ./$$test_target $(CURDIR)/models/ggml-vocab-falcon.gguf; \
218
+ ./$$test_target $(CURDIR)/models/ggml-vocab-bert-bge.gguf; \
219
+ ./$$test_target $(CURDIR)/models/ggml-vocab-starcoder.gguf; \
220
+ ./$$test_target $(CURDIR)/models/ggml-vocab-gpt-2.gguf; \
221
+ ./$$test_target $(CURDIR)/models/ggml-vocab-refact.gguf; \
222
+ elif [ "$$test_target" = "tests/test-tokenizer-1-spm" ]; then \
223
+ continue; \
224
+ elif [ "$$test_target" = "tests/test-tokenizer-1-bpe" ]; then \
225
+ continue; \
226
+ else \
227
+ echo "Running test $$test_target..."; \
228
+ ./$$test_target; \
229
+ fi; \
230
+ if [ $$? -ne 0 ]; then \
231
+ printf 'Test %s FAILED!\n\n' $$test_target; \
232
+ failures=$$(( failures + 1 )); \
233
+ else \
234
+ printf 'Test %s passed.\n\n' $$test_target; \
235
+ fi; \
236
+ done; \
237
+ if [ $$failures -gt 0 ]; then \
238
+ printf '\n%s tests failed.\n' $$failures; \
239
+ exit 1; \
240
+ fi
241
+ @echo 'All tests passed.'
242
+
243
+ all: $(BUILD_TARGETS) $(TEST_TARGETS) $(LEGACY_TARGETS_BUILD)
244
+
245
+ ifdef RISCV_CROSS_COMPILE
246
+ CC := riscv64-unknown-linux-gnu-gcc
247
+ CXX := riscv64-unknown-linux-gnu-g++
248
+ endif
249
+
250
+ #
251
+ # Compile flags
252
+ #
253
+
254
+ # keep standard at C11 and C++11
255
+ MK_CPPFLAGS = -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon
256
+ MK_CFLAGS = -std=c11 -fPIC
257
+ MK_CXXFLAGS = -std=c++11 -fPIC
258
+ MK_NVCCFLAGS = -std=c++11
259
+
260
+ ifdef LLAMA_NO_CCACHE
261
+ GGML_NO_CCACHE := 1
262
+ DEPRECATE_WARNING := 1
263
+ endif
264
+
265
+ ifndef GGML_NO_CCACHE
266
+ CCACHE := $(shell which ccache)
267
+ ifdef CCACHE
268
+ export CCACHE_SLOPPINESS = time_macros
269
+ $(info I ccache found, compilation results will be cached. Disable with GGML_NO_CCACHE.)
270
+ CC := $(CCACHE) $(CC)
271
+ CXX := $(CCACHE) $(CXX)
272
+ else
273
+ $(info I ccache not found. Consider installing it for faster compilation.)
274
+ endif # CCACHE
275
+ endif # GGML_NO_CCACHE
276
+
277
+ # clock_gettime came in POSIX.1b (1993)
278
+ # CLOCK_MONOTONIC came in POSIX.1-2001 / SUSv3 as optional
279
+ # posix_memalign came in POSIX.1-2001 / SUSv3
280
+ # M_PI is an XSI extension since POSIX.1-2001 / SUSv3, came in XPG1 (1985)
281
+ MK_CPPFLAGS += -D_XOPEN_SOURCE=600
282
+
283
+ # Somehow in OpenBSD whenever POSIX conformance is specified
284
+ # some string functions rely on locale_t availability,
285
+ # which was introduced in POSIX.1-2008, forcing us to go higher
286
+ ifeq ($(UNAME_S),OpenBSD)
287
+ MK_CPPFLAGS += -U_XOPEN_SOURCE -D_XOPEN_SOURCE=700
288
+ endif
289
+
290
+ # Data types, macros and functions related to controlling CPU affinity and
291
+ # some memory allocation are available on Linux through GNU extensions in libc
292
+ ifeq ($(UNAME_S),Linux)
293
+ MK_CPPFLAGS += -D_GNU_SOURCE
294
+ endif
295
+
296
+ # RLIMIT_MEMLOCK came in BSD, is not specified in POSIX.1,
297
+ # and on macOS its availability depends on enabling Darwin extensions
298
+ # similarly on DragonFly, enabling BSD extensions is necessary
299
+ ifeq ($(UNAME_S),Darwin)
300
+ MK_CPPFLAGS += -D_DARWIN_C_SOURCE
301
+ endif
302
+ ifeq ($(UNAME_S),DragonFly)
303
+ MK_CPPFLAGS += -D__BSD_VISIBLE
304
+ endif
305
+
306
+ # alloca is a non-standard interface that is not visible on BSDs when
307
+ # POSIX conformance is specified, but not all of them provide a clean way
308
+ # to enable it in such cases
309
+ ifeq ($(UNAME_S),FreeBSD)
310
+ MK_CPPFLAGS += -D__BSD_VISIBLE
311
+ endif
312
+ ifeq ($(UNAME_S),NetBSD)
313
+ MK_CPPFLAGS += -D_NETBSD_SOURCE
314
+ endif
315
+ ifeq ($(UNAME_S),OpenBSD)
316
+ MK_CPPFLAGS += -D_BSD_SOURCE
317
+ endif
318
+
319
+ ifdef GGML_SCHED_MAX_COPIES
320
+ MK_CPPFLAGS += -DGGML_SCHED_MAX_COPIES=$(GGML_SCHED_MAX_COPIES)
321
+ endif
322
+
323
+ ifdef LLAMA_DEBUG
324
+ MK_CFLAGS += -O0 -g
325
+ MK_CXXFLAGS += -O0 -g
326
+ MK_LDFLAGS += -g
327
+ MK_NVCCFLAGS += -O0 -g
328
+
329
+ ifeq ($(UNAME_S),Linux)
330
+ MK_CPPFLAGS += -D_GLIBCXX_ASSERTIONS
331
+ endif
332
+ else
333
+ MK_CPPFLAGS += -DNDEBUG
334
+ MK_CFLAGS += -O3 -g
335
+ MK_CXXFLAGS += -O3 -g
336
+ MK_NVCCFLAGS += -O3 -g
337
+ endif
338
+
339
+ ifdef LLAMA_SANITIZE_THREAD
340
+ MK_CFLAGS += -fsanitize=thread -g
341
+ MK_CXXFLAGS += -fsanitize=thread -g
342
+ MK_LDFLAGS += -fsanitize=thread -g
343
+ endif
344
+
345
+ ifdef LLAMA_SANITIZE_ADDRESS
346
+ MK_CFLAGS += -fsanitize=address -fno-omit-frame-pointer -g
347
+ MK_CXXFLAGS += -fsanitize=address -fno-omit-frame-pointer -g
348
+ MK_LDFLAGS += -fsanitize=address -fno-omit-frame-pointer -g
349
+ endif
350
+
351
+ ifdef LLAMA_SANITIZE_UNDEFINED
352
+ MK_CFLAGS += -fsanitize=undefined -g
353
+ MK_CXXFLAGS += -fsanitize=undefined -g
354
+ MK_LDFLAGS += -fsanitize=undefined -g
355
+ endif
356
+
357
+ ifdef LLAMA_SERVER_SSL
358
+ MK_CPPFLAGS += -DCPPHTTPLIB_OPENSSL_SUPPORT
359
+ MK_LDFLAGS += -lssl -lcrypto
360
+ endif
361
+
362
+ # warnings
363
+ WARN_FLAGS = \
364
+ -Wall \
365
+ -Wextra \
366
+ -Wpedantic \
367
+ -Wcast-qual \
368
+ -Wno-unused-function
369
+
370
+ MK_CFLAGS += \
371
+ $(WARN_FLAGS) \
372
+ -Wshadow \
373
+ -Wstrict-prototypes \
374
+ -Wpointer-arith \
375
+ -Wmissing-prototypes \
376
+ -Werror=implicit-int \
377
+ -Werror=implicit-function-declaration
378
+
379
+ MK_CXXFLAGS += \
380
+ $(WARN_FLAGS) \
381
+ -Wmissing-declarations \
382
+ -Wmissing-noreturn
383
+
384
+ ifeq ($(LLAMA_FATAL_WARNINGS),1)
385
+ MK_CFLAGS += -Werror
386
+ MK_CXXFLAGS += -Werror
387
+ endif
388
+
389
+ # this version of Apple ld64 is buggy
390
+ ifneq '' '$(findstring dyld-1015.7,$(shell $(CC) $(LDFLAGS) -Wl,-v 2>&1))'
391
+ MK_CPPFLAGS += -DHAVE_BUGGY_APPLE_LINKER
392
+ endif
393
+
394
+ # OS specific
395
+ # TODO: support Windows
396
+ ifneq '' '$(filter $(UNAME_S),Linux Darwin FreeBSD NetBSD OpenBSD Haiku)'
397
+ MK_CFLAGS += -pthread
398
+ MK_CXXFLAGS += -pthread
399
+ endif
400
+
401
+ # detect Windows
402
+ ifneq ($(findstring _NT,$(UNAME_S)),)
403
+ _WIN32 := 1
404
+ endif
405
+
406
+ # library name prefix
407
+ ifneq ($(_WIN32),1)
408
+ LIB_PRE := lib
409
+ endif
410
+
411
+ # Dynamic Shared Object extension
412
+ ifneq ($(_WIN32),1)
413
+ DSO_EXT := .so
414
+ else
415
+ DSO_EXT := .dll
416
+ endif
417
+
418
+ # Windows Sockets 2 (Winsock) for network-capable apps
419
+ ifeq ($(_WIN32),1)
420
+ LWINSOCK2 := -lws2_32
421
+ endif
422
+
423
+ ifdef LLAMA_GPROF
424
+ MK_CFLAGS += -pg
425
+ MK_CXXFLAGS += -pg
426
+ endif
427
+
428
+ # Architecture specific
429
+ # TODO: probably these flags need to be tweaked on some architectures
430
+ # feel free to update the Makefile for your architecture and send a pull request or issue
431
+
432
+ ifndef RISCV_CROSS_COMPILE
433
+
434
+ ifeq ($(UNAME_M),$(filter $(UNAME_M),x86_64 i686 amd64))
435
+ # Use all CPU extensions that are available:
436
+ MK_CFLAGS += -march=native -mtune=native
437
+ HOST_CXXFLAGS += -march=native -mtune=native
438
+
439
+ # Use AVX only
440
+ #MK_CFLAGS += -mfma -mf16c -mavx
441
+ #MK_CXXFLAGS += -mfma -mf16c -mavx
442
+
443
+ # Use SSSE3 only (not the same as SSE3!)
444
+ #MK_CFLAGS += -mssse3
445
+ #MK_CXXFLAGS += -mssse3
446
+ endif
447
+
448
+ ifneq '' '$(findstring mingw,$(shell $(CC) -dumpmachine))'
449
+ # The stack is only 16-byte aligned on Windows, so don't let gcc emit aligned moves.
450
+ # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
451
+ # https://github.com/ggerganov/llama.cpp/issues/2922
452
+ MK_CFLAGS += -Xassembler -muse-unaligned-vector-move
453
+ MK_CXXFLAGS += -Xassembler -muse-unaligned-vector-move
454
+
455
+ # Target Windows 8 for PrefetchVirtualMemory
456
+ MK_CPPFLAGS += -D_WIN32_WINNT=0x602
457
+ endif
458
+
459
+ ifneq ($(filter aarch64%,$(UNAME_M)),)
460
+ # Apple M1, M2, etc.
461
+ # Raspberry Pi 3, 4, Zero 2 (64-bit)
462
+ # Nvidia Jetson
463
+ MK_CFLAGS += -mcpu=native
464
+ MK_CXXFLAGS += -mcpu=native
465
+ JETSON_RELEASE_INFO = $(shell jetson_release)
466
+ ifdef JETSON_RELEASE_INFO
467
+ ifneq ($(filter TX2%,$(JETSON_RELEASE_INFO)),)
468
+ JETSON_EOL_MODULE_DETECT = 1
469
+ CC = aarch64-unknown-linux-gnu-gcc
470
+ CXX = aarch64-unknown-linux-gnu-g++
471
+ endif
472
+ endif
473
+ endif
474
+
475
+ ifneq ($(filter armv6%,$(UNAME_M)),)
476
+ # Raspberry Pi 1, Zero
477
+ MK_CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access
478
+ MK_CXXFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access
479
+ endif
480
+
481
+ ifneq ($(filter armv7%,$(UNAME_M)),)
482
+ # Raspberry Pi 2
483
+ MK_CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
484
+ MK_CXXFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
485
+ endif
486
+
487
+ ifneq ($(filter armv8%,$(UNAME_M)),)
488
+ # Raspberry Pi 3, 4, Zero 2 (32-bit)
489
+ MK_CFLAGS += -mfp16-format=ieee -mno-unaligned-access
490
+ MK_CXXFLAGS += -mfp16-format=ieee -mno-unaligned-access
491
+ endif
492
+
493
+ ifneq ($(filter ppc64%,$(UNAME_M)),)
494
+ POWER9_M := $(shell grep "POWER9" /proc/cpuinfo)
495
+ ifneq (,$(findstring POWER9,$(POWER9_M)))
496
+ MK_CFLAGS += -mcpu=power9
497
+ MK_CXXFLAGS += -mcpu=power9
498
+ endif
499
+ endif
500
+
501
+ ifneq ($(filter ppc64le%,$(UNAME_M)),)
502
+ MK_CFLAGS += -mcpu=powerpc64le
503
+ MK_CXXFLAGS += -mcpu=powerpc64le
504
+ CUDA_POWER_ARCH = 1
505
+ endif
506
+
507
+ ifneq ($(filter loongarch64%,$(UNAME_M)),)
508
+ MK_CFLAGS += -mlasx
509
+ MK_CXXFLAGS += -mlasx
510
+ endif
511
+
512
+ ifneq ($(filter riscv64%,$(UNAME_M)),)
513
+ MK_CFLAGS += -march=rv64gcv -mabi=lp64d
514
+ MK_CXXFLAGS += -march=rv64gcv -mabi=lp64d
515
+ endif
516
+
517
+ else # RISC-V CROSS COMPILATION
518
+ MK_CFLAGS += -march=rv64gcv -mabi=lp64d
519
+ MK_CXXFLAGS += -march=rv64gcv -mabi=lp64d
520
+ endif
521
+
522
+ ifndef GGML_NO_ACCELERATE
523
+ # Mac OS - include Accelerate framework.
524
+ # `-framework Accelerate` works both with Apple Silicon and Mac Intel
525
+ ifeq ($(UNAME_S),Darwin)
526
+ MK_CPPFLAGS += -DGGML_USE_ACCELERATE -DGGML_USE_BLAS
527
+ MK_CPPFLAGS += -DACCELERATE_NEW_LAPACK
528
+ MK_CPPFLAGS += -DACCELERATE_LAPACK_ILP64
529
+ MK_LDFLAGS += -framework Accelerate
530
+ OBJ_GGML += ggml/src/ggml-blas.o
531
+ endif
532
+ endif # GGML_NO_ACCELERATE
533
+
534
+ ifdef GGML_MUSA
535
+ CC := clang
536
+ CXX := clang++
537
+ GGML_CUDA := 1
538
+ MK_CPPFLAGS += -DGGML_USE_MUSA
539
+ endif
540
+
541
+ ifndef GGML_NO_OPENMP
542
+ MK_CPPFLAGS += -DGGML_USE_OPENMP
543
+ MK_CFLAGS += -fopenmp
544
+ MK_CXXFLAGS += -fopenmp
545
+ ifdef GGML_MUSA
546
+ MK_CPPFLAGS += -I/usr/lib/llvm-10/include/openmp
547
+ MK_LDFLAGS += -L/usr/lib/llvm-10/lib
548
+ endif # GGML_MUSA
549
+ endif # GGML_NO_OPENMP
550
+
551
+ ifdef GGML_OPENBLAS
552
+ MK_CPPFLAGS += -DGGML_USE_BLAS $(shell pkg-config --cflags-only-I openblas)
553
+ MK_CFLAGS += $(shell pkg-config --cflags-only-other openblas)
554
+ MK_LDFLAGS += $(shell pkg-config --libs openblas)
555
+ OBJ_GGML += ggml/src/ggml-blas.o
556
+ endif # GGML_OPENBLAS
557
+
558
+ ifdef GGML_OPENBLAS64
559
+ MK_CPPFLAGS += -DGGML_USE_BLAS $(shell pkg-config --cflags-only-I openblas64)
560
+ MK_CFLAGS += $(shell pkg-config --cflags-only-other openblas64)
561
+ MK_LDFLAGS += $(shell pkg-config --libs openblas64)
562
+ OBJ_GGML += ggml/src/ggml-blas.o
563
+ endif # GGML_OPENBLAS64
564
+
565
+ ifdef GGML_BLIS
566
+ MK_CPPFLAGS += -DGGML_USE_BLAS -DGGML_BLAS_USE_BLIS -I/usr/local/include/blis -I/usr/include/blis
567
+ MK_LDFLAGS += -lblis -L/usr/local/lib
568
+ OBJ_GGML += ggml/src/ggml-blas.o
569
+ endif # GGML_BLIS
570
+
571
+ ifdef GGML_NVPL
572
+ MK_CPPFLAGS += -DGGML_USE_BLAS -DGGML_BLAS_USE_NVPL -DNVPL_ILP64 -I/usr/local/include/nvpl_blas -I/usr/include/nvpl_blas
573
+ MK_LDFLAGS += -L/usr/local/lib -lnvpl_blas_core -lnvpl_blas_ilp64_gomp
574
+ OBJ_GGML += ggml/src/ggml-blas.o
575
+ endif # GGML_NVPL
576
+
577
+ ifndef GGML_NO_LLAMAFILE
578
+ MK_CPPFLAGS += -DGGML_USE_LLAMAFILE
579
+ OBJ_GGML += ggml/src/llamafile/sgemm.o
580
+ endif
581
+
582
+ ifndef GGML_NO_AMX
583
+ MK_CPPFLAGS += -DGGML_USE_AMX
584
+ OBJ_GGML += ggml/src/ggml-amx.o ggml/src/ggml-amx/mmq.o
585
+ endif
586
+
587
+ ifdef GGML_RPC
588
+ MK_CPPFLAGS += -DGGML_USE_RPC
589
+ OBJ_GGML += ggml/src/ggml-rpc.o
590
+ endif # GGML_RPC
591
+
592
+ OBJ_CUDA_TMPL = $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/fattn-wmma*.cu))
593
+ OBJ_CUDA_TMPL += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/mmq*.cu))
594
+
595
+ ifdef GGML_CUDA_FA_ALL_QUANTS
596
+ OBJ_CUDA_TMPL += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/fattn-vec*.cu))
597
+ else
598
+ OBJ_CUDA_TMPL += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/fattn-vec*q4_0-q4_0.cu))
599
+ OBJ_CUDA_TMPL += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/fattn-vec*q8_0-q8_0.cu))
600
+ OBJ_CUDA_TMPL += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/template-instances/fattn-vec*f16-f16.cu))
601
+ endif # GGML_CUDA_FA_ALL_QUANTS
602
+
603
+ ifdef GGML_CUDA
604
+ ifdef GGML_MUSA
605
+ ifneq ('', '$(wildcard /opt/musa)')
606
+ CUDA_PATH ?= /opt/musa
607
+ else
608
+ CUDA_PATH ?= /usr/local/musa
609
+ endif
610
+
611
+ MK_CPPFLAGS += -DGGML_USE_CUDA -I$(CUDA_PATH)/include
612
+ MK_LDFLAGS += -lmusa -lmublas -lmusart -lpthread -ldl -lrt -L$(CUDA_PATH)/lib -L/usr/lib64
613
+ MK_NVCCFLAGS += -x musa -mtgpu --cuda-gpu-arch=mp_21 --cuda-gpu-arch=mp_22
614
+ else
615
+ ifneq ('', '$(wildcard /opt/cuda)')
616
+ CUDA_PATH ?= /opt/cuda
617
+ else
618
+ CUDA_PATH ?= /usr/local/cuda
619
+ endif
620
+
621
+ MK_CPPFLAGS += -DGGML_USE_CUDA -DGGML_CUDA_USE_GRAPHS -I$(CUDA_PATH)/include -I$(CUDA_PATH)/targets/$(UNAME_M)-linux/include
622
+ MK_LDFLAGS += -lcuda -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -L$(CUDA_PATH)/lib64 -L/usr/lib64 -L$(CUDA_PATH)/targets/$(UNAME_M)-linux/lib -L$(CUDA_PATH)/lib64/stubs -L/usr/lib/wsl/lib
623
+ MK_NVCCFLAGS += -use_fast_math
624
+ endif # GGML_MUSA
625
+
626
+ OBJ_GGML += ggml/src/ggml-cuda.o
627
+ OBJ_GGML += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/*.cu))
628
+ OBJ_GGML += $(OBJ_CUDA_TMPL)
629
+
630
+ ifdef LLAMA_FATAL_WARNINGS
631
+ MK_NVCCFLAGS += -Werror all-warnings
632
+ endif # LLAMA_FATAL_WARNINGS
633
+
634
+ ifndef GGML_MUSA
635
+ ifndef JETSON_EOL_MODULE_DETECT
636
+ MK_NVCCFLAGS += --forward-unknown-to-host-compiler
637
+ endif # JETSON_EOL_MODULE_DETECT
638
+ endif # GGML_MUSA
639
+
640
+ ifdef LLAMA_DEBUG
641
+ MK_NVCCFLAGS += -lineinfo
642
+ endif # LLAMA_DEBUG
643
+
644
+ ifdef GGML_CUDA_DEBUG
645
+ MK_NVCCFLAGS += --device-debug
646
+ endif # GGML_CUDA_DEBUG
647
+
648
+ ifdef GGML_CUDA_NVCC
649
+ NVCC = $(CCACHE) $(GGML_CUDA_NVCC)
650
+ else
651
+ ifdef GGML_MUSA
652
+ NVCC = $(CCACHE) mcc
653
+ else
654
+ NVCC = $(CCACHE) nvcc
655
+ endif # GGML_MUSA
656
+ endif # GGML_CUDA_NVCC
657
+
658
+ ifdef CUDA_DOCKER_ARCH
659
+ MK_NVCCFLAGS += -Wno-deprecated-gpu-targets -arch=$(CUDA_DOCKER_ARCH)
660
+ else ifndef CUDA_POWER_ARCH
661
+ MK_NVCCFLAGS += -arch=native
662
+ endif # CUDA_DOCKER_ARCH
663
+
664
+ ifdef GGML_CUDA_FORCE_DMMV
665
+ MK_NVCCFLAGS += -DGGML_CUDA_FORCE_DMMV
666
+ endif # GGML_CUDA_FORCE_DMMV
667
+
668
+ ifdef GGML_CUDA_FORCE_MMQ
669
+ MK_NVCCFLAGS += -DGGML_CUDA_FORCE_MMQ
670
+ endif # GGML_CUDA_FORCE_MMQ
671
+
672
+ ifdef GGML_CUDA_FORCE_CUBLAS
673
+ MK_NVCCFLAGS += -DGGML_CUDA_FORCE_CUBLAS
674
+ endif # GGML_CUDA_FORCE_CUBLAS
675
+
676
+ ifdef GGML_CUDA_DMMV_X
677
+ MK_NVCCFLAGS += -DGGML_CUDA_DMMV_X=$(GGML_CUDA_DMMV_X)
678
+ else
679
+ MK_NVCCFLAGS += -DGGML_CUDA_DMMV_X=32
680
+ endif # GGML_CUDA_DMMV_X
681
+
682
+ ifdef GGML_CUDA_MMV_Y
683
+ MK_NVCCFLAGS += -DGGML_CUDA_MMV_Y=$(GGML_CUDA_MMV_Y)
684
+ else ifdef GGML_CUDA_DMMV_Y
685
+ MK_NVCCFLAGS += -DGGML_CUDA_MMV_Y=$(GGML_CUDA_DMMV_Y) # for backwards compatibility
686
+ else
687
+ MK_NVCCFLAGS += -DGGML_CUDA_MMV_Y=1
688
+ endif # GGML_CUDA_MMV_Y
689
+
690
+ ifdef GGML_CUDA_F16
691
+ MK_NVCCFLAGS += -DGGML_CUDA_F16
692
+ endif # GGML_CUDA_F16
693
+
694
+ ifdef GGML_CUDA_DMMV_F16
695
+ MK_NVCCFLAGS += -DGGML_CUDA_F16
696
+ endif # GGML_CUDA_DMMV_F16
697
+
698
+ ifdef GGML_CUDA_KQUANTS_ITER
699
+ MK_NVCCFLAGS += -DK_QUANTS_PER_ITERATION=$(GGML_CUDA_KQUANTS_ITER)
700
+ else
701
+ MK_NVCCFLAGS += -DK_QUANTS_PER_ITERATION=2
702
+ endif
703
+
704
+ ifdef GGML_CUDA_PEER_MAX_BATCH_SIZE
705
+ MK_NVCCFLAGS += -DGGML_CUDA_PEER_MAX_BATCH_SIZE=$(GGML_CUDA_PEER_MAX_BATCH_SIZE)
706
+ else
707
+ MK_NVCCFLAGS += -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
708
+ endif # GGML_CUDA_PEER_MAX_BATCH_SIZE
709
+
710
+ ifdef GGML_CUDA_NO_PEER_COPY
711
+ MK_NVCCFLAGS += -DGGML_CUDA_NO_PEER_COPY
712
+ endif # GGML_CUDA_NO_PEER_COPY
713
+
714
+ ifdef GGML_CUDA_CCBIN
715
+ MK_NVCCFLAGS += -ccbin $(GGML_CUDA_CCBIN)
716
+ endif # GGML_CUDA_CCBIN
717
+
718
+ ifdef GGML_CUDA_FA_ALL_QUANTS
719
+ MK_NVCCFLAGS += -DGGML_CUDA_FA_ALL_QUANTS
720
+ endif # GGML_CUDA_FA_ALL_QUANTS
721
+
722
+ ifdef JETSON_EOL_MODULE_DETECT
723
+ define NVCC_COMPILE
724
+ $(NVCC) -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_CUDA -I/usr/local/cuda/include -I/opt/cuda/include -I/usr/local/cuda/targets/aarch64-linux/include -std=c++11 -O3 $(NVCCFLAGS) $(CPPFLAGS) -Xcompiler "$(CUDA_CXXFLAGS)" -c $< -o $@
725
+ endef # NVCC_COMPILE
726
+ else
727
+ ifdef GGML_MUSA
728
+ define NVCC_COMPILE
729
+ $(NVCC) $(NVCCFLAGS) $(CPPFLAGS) -c $< -o $@
730
+ endef # NVCC_COMPILE
731
+ else
732
+ define NVCC_COMPILE
733
+ $(NVCC) $(NVCCFLAGS) $(CPPFLAGS) -Xcompiler "$(CUDA_CXXFLAGS)" -c $< -o $@
734
+ endef # NVCC_COMPILE
735
+ endif # GGML_MUSA
736
+ endif # JETSON_EOL_MODULE_DETECT
737
+
738
+ ggml/src/ggml-cuda/%.o: \
739
+ ggml/src/ggml-cuda/%.cu \
740
+ ggml/include/ggml.h \
741
+ ggml/src/ggml-common.h \
742
+ ggml/src/ggml-cuda/common.cuh
743
+ $(NVCC_COMPILE)
744
+
745
+ ggml/src/ggml-cuda.o: \
746
+ ggml/src/ggml-cuda.cu \
747
+ ggml/include/ggml-cuda.h \
748
+ ggml/include/ggml.h \
749
+ ggml/include/ggml-backend.h \
750
+ ggml/src/ggml-backend-impl.h \
751
+ ggml/src/ggml-common.h \
752
+ $(wildcard ggml/src/ggml-cuda/*.cuh)
753
+ $(NVCC_COMPILE)
754
+ endif # GGML_CUDA
755
+
756
+ ifdef GGML_VULKAN
757
+ MK_CPPFLAGS += -DGGML_USE_VULKAN
758
+ MK_LDFLAGS += $(shell pkg-config --libs vulkan)
759
+ OBJ_GGML += ggml/src/ggml-vulkan.o ggml/src/ggml-vulkan-shaders.o
760
+
761
+ ifdef GGML_VULKAN_CHECK_RESULTS
762
+ MK_CPPFLAGS += -DGGML_VULKAN_CHECK_RESULTS
763
+ endif
764
+
765
+ ifdef GGML_VULKAN_DEBUG
766
+ MK_CPPFLAGS += -DGGML_VULKAN_DEBUG
767
+ endif
768
+
769
+ ifdef GGML_VULKAN_MEMORY_DEBUG
770
+ MK_CPPFLAGS += -DGGML_VULKAN_MEMORY_DEBUG
771
+ endif
772
+
773
+ ifdef GGML_VULKAN_PERF
774
+ MK_CPPFLAGS += -DGGML_VULKAN_PERF
775
+ endif
776
+
777
+ ifdef GGML_VULKAN_VALIDATE
778
+ MK_CPPFLAGS += -DGGML_VULKAN_VALIDATE
779
+ endif
780
+
781
+ ifdef GGML_VULKAN_RUN_TESTS
782
+ MK_CPPFLAGS += -DGGML_VULKAN_RUN_TESTS
783
+ endif
784
+
785
+ GLSLC_CMD = glslc
786
+ _ggml_vk_genshaders_cmd = $(shell pwd)/vulkan-shaders-gen
787
+ _ggml_vk_header = ggml/src/ggml-vulkan-shaders.hpp
788
+ _ggml_vk_source = ggml/src/ggml-vulkan-shaders.cpp
789
+ _ggml_vk_input_dir = ggml/src/vulkan-shaders
790
+ _ggml_vk_shader_deps = $(wildcard $(_ggml_vk_input_dir)/*.comp)
791
+
792
+ ggml/src/ggml-vulkan.o: ggml/src/ggml-vulkan.cpp ggml/include/ggml-vulkan.h $(_ggml_vk_header) $(_ggml_vk_source)
793
+ $(CXX) $(CXXFLAGS) $(shell pkg-config --cflags vulkan) -c $< -o $@
794
+
795
+ $(_ggml_vk_header): $(_ggml_vk_source)
796
+
797
+ $(_ggml_vk_source): $(_ggml_vk_shader_deps) vulkan-shaders-gen
798
+ $(_ggml_vk_genshaders_cmd) \
799
+ --glslc $(GLSLC_CMD) \
800
+ --input-dir $(_ggml_vk_input_dir) \
801
+ --target-hpp $(_ggml_vk_header) \
802
+ --target-cpp $(_ggml_vk_source)
803
+
804
+ vulkan-shaders-gen: ggml/src/vulkan-shaders/vulkan-shaders-gen.cpp
805
+ $(CXX) $(CXXFLAGS) -o $@ $(LDFLAGS) ggml/src/vulkan-shaders/vulkan-shaders-gen.cpp
806
+
807
+ endif # GGML_VULKAN
808
+
809
+ ifdef GGML_HIPBLAS
810
+ ifeq ($(wildcard /opt/rocm),)
811
+ ROCM_PATH ?= /usr
812
+ AMDGPU_TARGETS ?= $(shell $(shell which amdgpu-arch))
813
+ else
814
+ ROCM_PATH ?= /opt/rocm
815
+ AMDGPU_TARGETS ?= $(shell $(ROCM_PATH)/llvm/bin/amdgpu-arch)
816
+ endif
817
+
818
+ GGML_CUDA_DMMV_X ?= 32
819
+ GGML_CUDA_MMV_Y ?= 1
820
+ GGML_CUDA_KQUANTS_ITER ?= 2
821
+
822
+ MK_CPPFLAGS += -DGGML_USE_HIPBLAS -DGGML_USE_CUDA
823
+
824
+ ifdef GGML_HIP_UMA
825
+ MK_CPPFLAGS += -DGGML_HIP_UMA
826
+ endif # GGML_HIP_UMA
827
+
828
+ MK_LDFLAGS += -L$(ROCM_PATH)/lib -Wl,-rpath=$(ROCM_PATH)/lib
829
+ MK_LDFLAGS += -L$(ROCM_PATH)/lib64 -Wl,-rpath=$(ROCM_PATH)/lib64
830
+ MK_LDFLAGS += -lhipblas -lamdhip64 -lrocblas
831
+
832
+ HIPCC ?= $(CCACHE) $(ROCM_PATH)/bin/hipcc
833
+
834
+ HIPFLAGS += $(addprefix --offload-arch=,$(AMDGPU_TARGETS))
835
+ HIPFLAGS += -DGGML_CUDA_DMMV_X=$(GGML_CUDA_DMMV_X)
836
+ HIPFLAGS += -DGGML_CUDA_MMV_Y=$(GGML_CUDA_MMV_Y)
837
+ HIPFLAGS += -DK_QUANTS_PER_ITERATION=$(GGML_CUDA_KQUANTS_ITER)
838
+
839
+ ifdef GGML_CUDA_FORCE_DMMV
840
+ HIPFLAGS += -DGGML_CUDA_FORCE_DMMV
841
+ endif # GGML_CUDA_FORCE_DMMV
842
+
843
+ ifdef GGML_CUDA_FORCE_MMQ
844
+ HIPFLAGS += -DGGML_CUDA_FORCE_MMQ
845
+ endif # GGML_CUDA_FORCE_MMQ
846
+
847
+ ifdef GGML_CUDA_FORCE_CUBLAS
848
+ HIPFLAGS += -DGGML_CUDA_FORCE_CUBLAS
849
+ endif # GGML_CUDA_FORCE_CUBLAS
850
+
851
+ ifdef GGML_CUDA_NO_PEER_COPY
852
+ HIPFLAGS += -DGGML_CUDA_NO_PEER_COPY
853
+ endif # GGML_CUDA_NO_PEER_COPY
854
+
855
+ OBJ_GGML += ggml/src/ggml-cuda.o
856
+ OBJ_GGML += $(patsubst %.cu,%.o,$(wildcard ggml/src/ggml-cuda/*.cu))
857
+ OBJ_GGML += $(OBJ_CUDA_TMPL)
858
+
859
+ ggml/src/ggml-cuda.o: \
860
+ ggml/src/ggml-cuda.cu \
861
+ ggml/include/ggml-cuda.h \
862
+ ggml/include/ggml.h \
863
+ ggml/include/ggml-backend.h \
864
+ ggml/src/ggml-backend-impl.h \
865
+ ggml/src/ggml-common.h \
866
+ $(wildcard ggml/src/ggml-cuda/*.cuh)
867
+ $(HIPCC) $(CXXFLAGS) $(HIPFLAGS) -x hip -c -o $@ $<
868
+
869
+ ggml/src/ggml-cuda/%.o: \
870
+ ggml/src/ggml-cuda/%.cu \
871
+ ggml/include/ggml.h \
872
+ ggml/src/ggml-common.h \
873
+ ggml/src/ggml-cuda/common.cuh
874
+ $(HIPCC) $(CXXFLAGS) $(HIPFLAGS) -x hip -c -o $@ $<
875
+ endif # GGML_HIPBLAS
876
+
877
+ ifdef GGML_METAL
878
+ MK_CPPFLAGS += -DGGML_USE_METAL
879
+ MK_LDFLAGS += -framework Foundation -framework Metal -framework MetalKit
880
+ OBJ_GGML += ggml/src/ggml-metal.o
881
+
882
+ ifdef GGML_METAL_USE_BF16
883
+ MK_CPPFLAGS += -DGGML_METAL_USE_BF16
884
+ endif # GGML_METAL_USE_BF16
885
+ ifdef GGML_METAL_NDEBUG
886
+ MK_CPPFLAGS += -DGGML_METAL_NDEBUG
887
+ endif
888
+ ifdef GGML_METAL_EMBED_LIBRARY
889
+ MK_CPPFLAGS += -DGGML_METAL_EMBED_LIBRARY
890
+ OBJ_GGML += ggml/src/ggml-metal-embed.o
891
+ endif
892
+ endif # GGML_METAL
893
+
894
+ ifdef GGML_METAL
895
+ ggml/src/ggml-metal.o: \
896
+ ggml/src/ggml-metal.m \
897
+ ggml/include/ggml-metal.h \
898
+ ggml/include/ggml.h
899
+ $(CC) $(CFLAGS) -c $< -o $@
900
+
901
+ ifdef GGML_METAL_EMBED_LIBRARY
902
+ ggml/src/ggml-metal-embed.o: \
903
+ ggml/src/ggml-metal.metal \
904
+ ggml/src/ggml-common.h
905
+ @echo "Embedding Metal library"
906
+ @sed -e '/#include "ggml-common.h"/r ggml/src/ggml-common.h' -e '/#include "ggml-common.h"/d' < ggml/src/ggml-metal.metal > ggml/src/ggml-metal-embed.metal
907
+ $(eval TEMP_ASSEMBLY=$(shell mktemp -d))
908
+ @echo ".section __DATA, __ggml_metallib" > $(TEMP_ASSEMBLY)/ggml-metal-embed.s
909
+ @echo ".globl _ggml_metallib_start" >> $(TEMP_ASSEMBLY)/ggml-metal-embed.s
910
+ @echo "_ggml_metallib_start:" >> $(TEMP_ASSEMBLY)/ggml-metal-embed.s
911
+ @echo ".incbin \"ggml/src/ggml-metal-embed.metal\"" >> $(TEMP_ASSEMBLY)/ggml-metal-embed.s
912
+ @echo ".globl _ggml_metallib_end" >> $(TEMP_ASSEMBLY)/ggml-metal-embed.s
913
+ @echo "_ggml_metallib_end:" >> $(TEMP_ASSEMBLY)/ggml-metal-embed.s
914
+ $(CC) $(CFLAGS) -c $(TEMP_ASSEMBLY)/ggml-metal-embed.s -o $@
915
+ @rm -f ${TEMP_ASSEMBLY}/ggml-metal-embed.s
916
+ @rmdir ${TEMP_ASSEMBLY}
917
+ endif
918
+ endif # GGML_METAL
919
+
920
+ OBJ_GGML += \
921
+ ggml/src/ggml.o \
922
+ ggml/src/ggml-cpu.o \
923
+ ggml/src/ggml-alloc.o \
924
+ ggml/src/ggml-backend.o \
925
+ ggml/src/ggml-quants.o \
926
+ ggml/src/ggml-aarch64.o
927
+
928
+ OBJ_LLAMA = \
929
+ src/llama.o \
930
+ src/llama-vocab.o \
931
+ src/llama-grammar.o \
932
+ src/llama-sampling.o \
933
+ src/unicode.o \
934
+ src/unicode-data.o
935
+
936
+ OBJ_COMMON = \
937
+ common/common.o \
938
+ common/arg.o \
939
+ common/log.o \
940
+ common/console.o \
941
+ common/ngram-cache.o \
942
+ common/sampling.o \
943
+ common/build-info.o \
944
+ common/json-schema-to-grammar.o
945
+
946
+ OBJ_ALL = $(OBJ_GGML) $(OBJ_LLAMA) $(OBJ_COMMON)
947
+
948
+ LIB_GGML = $(LIB_PRE)ggml$(DSO_EXT)
949
+ LIB_GGML_S = $(LIB_PRE)ggml.a
950
+
951
+ LIB_LLAMA = $(LIB_PRE)llama$(DSO_EXT)
952
+ LIB_LLAMA_S = $(LIB_PRE)llama.a
953
+
954
+ LIB_COMMON = $(LIB_PRE)common$(DSO_EXT)
955
+ LIB_COMMON_S = $(LIB_PRE)common.a
956
+
957
+ LIB_ALL = $(LIB_GGML) $(LIB_LLAMA) $(LIB_COMMON)
958
+ LIB_ALL_S = $(LIB_GGML_S) $(LIB_LLAMA_S) $(LIB_COMMON_S)
959
+
960
+ GF_CC := $(CC)
961
+ include scripts/get-flags.mk
962
+
963
+ # combine build flags with cmdline overrides
964
+ override CPPFLAGS := $(MK_CPPFLAGS) $(CPPFLAGS)
965
+ override CFLAGS := $(CPPFLAGS) $(MK_CFLAGS) $(GF_CFLAGS) $(CFLAGS)
966
+ BASE_CXXFLAGS := $(MK_CXXFLAGS) $(CXXFLAGS)
967
+ override CXXFLAGS := $(BASE_CXXFLAGS) $(HOST_CXXFLAGS) $(GF_CXXFLAGS) $(CPPFLAGS)
968
+ override NVCCFLAGS := $(MK_NVCCFLAGS) $(NVCCFLAGS)
969
+ override LDFLAGS := $(MK_LDFLAGS) $(LDFLAGS)
970
+
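The `override ... :=` assignments above prepend the Makefile's own flags to whatever the user passed on the command line, so both sets survive. A minimal standalone sketch of that pattern follows; the file name `demo.mk` and the values `-DDEMO` and `-O3` are illustrative assumptions, not taken from this Makefile.

```make
# demo.mk: hypothetical file, run with: make -f demo.mk CFLAGS=-g
MK_CPPFLAGS := -DDEMO
MK_CFLAGS   := -O3

# Without `override`, a CFLAGS given on the command line would win over
# any ordinary assignment made here. With it, both sets of flags are kept,
# and the user's flags come last.
override CPPFLAGS := $(MK_CPPFLAGS) $(CPPFLAGS)
override CFLAGS   := $(CPPFLAGS) $(MK_CFLAGS) $(CFLAGS)

$(info CFLAGS: $(CFLAGS))

# No-op default target so that make has something to build.
all: ; @:
```

Run as shown, the printed `CFLAGS` line should contain `-DDEMO`, `-O3`, and the user-supplied `-g`, in that order.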
971
+ # identify CUDA host compiler
972
+ ifdef GGML_CUDA
973
+ GF_CC := $(NVCC) $(NVCCFLAGS) 2>/dev/null .c -Xcompiler
974
+ include scripts/get-flags.mk
975
+ CUDA_CXXFLAGS := $(BASE_CXXFLAGS) $(GF_CXXFLAGS) -Wno-pedantic
976
+ endif
977
+
978
+ ifdef LLAMA_CURL
979
+ override CXXFLAGS := $(CXXFLAGS) -DLLAMA_USE_CURL
980
+ override LDFLAGS := $(LDFLAGS) -lcurl
981
+ endif
982
+
983
+ #
984
+ # Print build information
985
+ #
986
+
987
+ $(info I llama.cpp build info: )
988
+ $(info I UNAME_S: $(UNAME_S))
989
+ $(info I UNAME_P: $(UNAME_P))
990
+ $(info I UNAME_M: $(UNAME_M))
991
+ $(info I CFLAGS: $(CFLAGS))
992
+ $(info I CXXFLAGS: $(CXXFLAGS))
993
+ $(info I NVCCFLAGS: $(NVCCFLAGS))
994
+ $(info I LDFLAGS: $(LDFLAGS))
995
+ $(info I CC: $(shell $(CC) --version | head -n 1))
996
+ $(info I CXX: $(shell $(CXX) --version | head -n 1))
997
+ ifdef GGML_CUDA
998
+ $(info I NVCC: $(shell $(NVCC) --version | tail -n 1))
999
+ CUDA_VERSION := $(shell $(NVCC) --version | grep -oP 'release (\K[0-9]+\.[0-9])')
1000
+ ifndef GGML_MUSA
1001
+ ifeq ($(shell awk -v "v=$(CUDA_VERSION)" 'BEGIN { print (v < 11.7) }'),1)
1002
+
1003
+ ifndef CUDA_DOCKER_ARCH
1004
+ ifndef CUDA_POWER_ARCH
1005
+ $(error I ERROR: For CUDA versions < 11.7 a target CUDA architecture must be explicitly provided via environment variable CUDA_DOCKER_ARCH, e.g. by running "export CUDA_DOCKER_ARCH=compute_XX" on Unix-like systems, where XX is the minimum compute capability that the code needs to run on. A list with compute capabilities can be found here: https://developer.nvidia.com/cuda-gpus )
1006
+ endif # CUDA_POWER_ARCH
1007
+ endif # CUDA_DOCKER_ARCH
1008
+
1009
+ endif # CUDA_VERSION < 11.7
1010
+ endif # GGML_MUSA
1011
+ endif # GGML_CUDA
1012
+ $(info )
1013
+
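The CUDA version check above relies on `awk` to compare two decimal numbers, since Make has no numeric comparison of its own. A minimal standalone sketch of the same idea, assuming a hypothetical `DEMO_VERSION` variable in place of the value parsed from `nvcc --version`:

```make
# version-check.mk: hypothetical file, run with: make -f version-check.mk
# DEMO_VERSION stands in for the version string parsed from the compiler.
DEMO_VERSION ?= 11.4

# awk compares the two values numerically and prints 1 (true) or 0 (false).
ifeq ($(shell awk -v "v=$(DEMO_VERSION)" 'BEGIN { print (v < 11.7) }'),1)
$(info $(DEMO_VERSION) is older than 11.7)
else
$(info $(DEMO_VERSION) is 11.7 or newer)
endif

# No-op default target so that make has something to build.
all: ; @:
```

Because the comparison is numeric rather than component-wise, a value such as `11.10` would be read as `11.1`. The `grep` pattern in the Makefile captures only a single minor digit, so that case does not appear to arise there.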
1014
+ ifdef DEPRECATE_WARNING
1015
+ $(info !!! DEPRECATION WARNING !!!)
1016
+ $(info The following LLAMA_ options are deprecated and will be removed in the future. Use the GGML_ prefix instead)
1017
+ $(info - LLAMA_CUDA)
1018
+ $(info - LLAMA_METAL)
1019
+ $(info - LLAMA_METAL_EMBED_LIBRARY)
1020
+ $(info - LLAMA_OPENMP)
1021
+ $(info - LLAMA_RPC)
1022
+ $(info - LLAMA_SYCL)
1023
+ $(info - LLAMA_SYCL_F16)
1024
+ $(info - LLAMA_OPENBLAS)
1025
+ $(info - LLAMA_OPENBLAS64)
1026
+ $(info - LLAMA_BLIS)
1027
+ $(info - LLAMA_NO_LLAMAFILE)
1028
+ $(info - LLAMA_NO_ACCELERATE)
1029
+ $(info - LLAMA_NO_OPENMP)
1030
+ $(info - LLAMA_NO_METAL)
1031
+ $(info - LLAMA_NO_CCACHE)
1032
+ $(info )
1033
+ endif
1034
+
1035
+ ifdef REMOVE_WARNING
1036
+ $(info !!! REMOVAL WARNING !!!)
1037
+ $(info The following LLAMA_ options have been removed and are no longer supported)
1038
+ $(info - LLAMA_DISABLE_LOGS (https://github.com/ggerganov/llama.cpp/pull/9418))
1039
+ $(info - LLAMA_SERVER_VERBOSE (https://github.com/ggerganov/llama.cpp/pull/9418))
1040
+ $(info )
1041
+ endif
1042
+
1043
+ #
1044
+ # Build libraries
1045
+ #
1046
+
1047
+ # ggml
1048
+
1049
+ ggml/src/ggml.o: \
1050
+ ggml/src/ggml.c \
1051
+ ggml/include/ggml.h
1052
+ $(CC) $(CFLAGS) -c $< -o $@
1053
+
1054
+ ggml/src/ggml-cpu.o: \
1055
+ ggml/src/ggml-cpu.c \
1056
+ ggml/include/ggml.h \
1057
+ ggml/src/ggml-common.h
1058
+ $(CC) $(CFLAGS) -c $< -o $@
1059
+
1060
+ ggml/src/ggml-alloc.o: \
1061
+ ggml/src/ggml-alloc.c \
1062
+ ggml/include/ggml.h \
1063
+ ggml/include/ggml-alloc.h
1064
+ $(CC) $(CFLAGS) -c $< -o $@
1065
+
1066
+ ggml/src/ggml-backend.o: \
1067
+ ggml/src/ggml-backend.cpp \
1068
+ ggml/src/ggml-backend-impl.h \
1069
+ ggml/include/ggml.h \
1070
+ ggml/include/ggml-backend.h
1071
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1072
+
1073
+ ggml/src/ggml-quants.o: \
1074
+ ggml/src/ggml-quants.c \
1075
+ ggml/include/ggml.h \
1076
+ ggml/src/ggml-quants.h \
1077
+ ggml/src/ggml-common.h
1078
+ $(CC) $(CFLAGS) -c $< -o $@
1079
+
1080
+ ggml/src/ggml-aarch64.o: \
1081
+ ggml/src/ggml-aarch64.c \
1082
+ ggml/include/ggml.h \
1083
+ ggml/src/ggml-aarch64.h \
1084
+ ggml/src/ggml-common.h
1085
+ $(CC) $(CFLAGS) -c $< -o $@
1086
+
1087
+ ggml/src/ggml-blas.o: \
1088
+ ggml/src/ggml-blas.cpp \
1089
+ ggml/include/ggml-blas.h
1090
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1091
+
1092
+ ifndef GGML_NO_LLAMAFILE
1093
+ ggml/src/llamafile/sgemm.o: \
1094
+ ggml/src/llamafile/sgemm.cpp \
1095
+ ggml/src/llamafile/sgemm.h \
1096
+ ggml/include/ggml.h
1097
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1098
+ endif # GGML_NO_LLAMAFILE
1099
+
1100
+ ifndef GGML_NO_AMX
1101
+ ggml/src/ggml-amx.o: \
1102
+ ggml/src/ggml-amx.cpp \
1103
+ ggml/include/ggml-amx.h
1104
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1105
+
1106
+ ggml/src/ggml-amx/mmq.o: \
1107
+ ggml/src/ggml-amx/mmq.cpp \
1108
+ ggml/src/ggml-amx/mmq.h \
1109
+ ggml/include/ggml.h
1110
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1111
+ endif
1112
+
1113
+ ifdef GGML_RPC
1114
+ ggml/src/ggml-rpc.o: \
1115
+ ggml/src/ggml-rpc.cpp \
1116
+ ggml/include/ggml-rpc.h
1117
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1118
+ endif # GGML_RPC
1119
+
1120
+ $(LIB_GGML): \
1121
+ $(OBJ_GGML)
1122
+ $(CXX) $(CXXFLAGS) -shared -fPIC -o $@ $^ $(LDFLAGS)
1123
+
1124
+ $(LIB_GGML_S): \
1125
+ $(OBJ_GGML)
1126
+ ar rcs $(LIB_GGML_S) $^
1127
+
1128
+ # llama
1129
+
1130
+ src/unicode.o: \
1131
+ src/unicode.cpp \
1132
+ src/unicode.h
1133
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1134
+
1135
+ src/unicode-data.o: \
1136
+ src/unicode-data.cpp \
1137
+ src/unicode-data.h
1138
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1139
+
1140
+ src/llama.o: \
1141
+ src/llama.cpp \
1142
+ src/llama-impl.h \
1143
+ src/llama-vocab.h \
1144
+ src/llama-grammar.h \
1145
+ src/llama-sampling.h \
1146
+ src/unicode.h \
1147
+ include/llama.h \
1148
+ ggml/include/ggml-cuda.h \
1149
+ ggml/include/ggml-metal.h \
1150
+ ggml/include/ggml.h \
1151
+ ggml/include/ggml-alloc.h \
1152
+ ggml/include/ggml-backend.h
1153
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1154
+
1155
+ src/llama-vocab.o: \
1156
+ src/llama-vocab.cpp \
1157
+ src/llama-vocab.h \
1158
+ src/llama-impl.h \
1159
+ include/llama.h
1160
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1161
+
1162
+ src/llama-grammar.o: \
1163
+ src/llama-grammar.cpp \
1164
+ src/llama-grammar.h \
1165
+ src/llama-impl.h \
1166
+ src/llama-vocab.h \
1167
+ src/llama-sampling.h \
1168
+ include/llama.h
1169
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1170
+
1171
+ src/llama-sampling.o: \
1172
+ src/llama-sampling.cpp \
1173
+ src/llama-sampling.h \
1174
+ src/llama-impl.h \
1175
+ include/llama.h
1176
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1177
+
1178
+ $(LIB_LLAMA): \
1179
+ $(OBJ_LLAMA) \
1180
+ $(LIB_GGML)
1181
+ $(CXX) $(CXXFLAGS) -shared -fPIC -o $@ $^ $(LDFLAGS)
1182
+
1183
+ $(LIB_LLAMA_S): \
1184
+ $(OBJ_LLAMA)
1185
+ ar rcs $(LIB_LLAMA_S) $^
1186
+
1187
+ # common
1188
+
1189
+ common/common.o: \
1190
+ common/common.cpp \
1191
+ common/common.h \
1192
+ common/console.h \
1193
+ common/sampling.h \
1194
+ common/json.hpp \
1195
+ common/json-schema-to-grammar.h \
1196
+ include/llama.h
1197
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1198
+
1199
+ common/arg.o: \
1200
+ common/arg.cpp \
1201
+ common/arg.h
1202
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1203
+
1204
+ common/log.o: \
1205
+ common/log.cpp \
1206
+ common/log.h
1207
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1208
+
1209
+ common/sampling.o: \
1210
+ common/sampling.cpp \
1211
+ common/sampling.h \
1212
+ include/llama.h
1213
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1214
+
1215
+ common/console.o: \
1216
+ common/console.cpp \
1217
+ common/console.h
1218
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1219
+
1220
+ common/json-schema-to-grammar.o: \
1221
+ common/json-schema-to-grammar.cpp \
1222
+ common/json-schema-to-grammar.h
1223
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1224
+
1225
+ common/ngram-cache.o: \
1226
+ common/ngram-cache.cpp \
1227
+ common/ngram-cache.h
1228
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1229
+
1230
+ $(LIB_COMMON): \
1231
+ $(OBJ_COMMON) \
1232
+ $(LIB_LLAMA) \
1233
+ $(LIB_GGML)
1234
+ $(CXX) $(CXXFLAGS) -shared -fPIC -o $@ $^ $(LDFLAGS)
1235
+
1236
+ $(LIB_COMMON_S): \
1237
+ $(OBJ_COMMON)
1238
+ ar rcs $(LIB_COMMON_S) $^
1239
+
1240
+ clean:
1241
+ rm -vrf *.dot $(BUILD_TARGETS) $(TEST_TARGETS)
1242
+ rm -rvf src/*.o
1243
+ rm -rvf tests/*.o
1244
+ rm -rvf examples/*.o
1245
+ rm -rvf common/*.o
1246
+ rm -rvf *.a
1247
+ rm -rvf *.dll
1248
+ rm -rvf *.so
1249
+ rm -rvf *.dot
1250
+ rm -rvf ggml/*.a
1251
+ rm -rvf ggml/*.dll
1252
+ rm -rvf ggml/*.so
1253
+ rm -vrf ggml/src/*.o
1254
+ rm -rvf ggml/src/llamafile/*.o
1255
+ rm -rvf common/build-info.cpp
1256
+ rm -vrf ggml/src/ggml-metal-embed.metal
1257
+ rm -vrf ggml/src/ggml-cuda/*.o
1258
+ rm -vrf ggml/src/ggml-cuda/template-instances/*.o
1259
+ rm -vrf ggml/src/ggml-amx/*.o
1260
+ rm -rvf $(BUILD_TARGETS)
1261
+ rm -rvf $(TEST_TARGETS)
1262
+ rm -f vulkan-shaders-gen ggml/src/ggml-vulkan-shaders.hpp ggml/src/ggml-vulkan-shaders.cpp
1263
+ rm -rvf $(LEGACY_TARGETS_CLEAN)
1264
+ find examples pocs -type f -name "*.o" -delete
1265
+
1266
+ #
1267
+ # Examples
1268
+ #
1269
+
1270
+ # $< is the first prerequisite, i.e. the source file.
1271
+ # Explicitly compile this to an object file so that it can be cached with ccache.
1272
+ # The source file is then filtered out from $^ (the list of all prerequisites) and the object file is added instead.
1273
+
1274
+ # Helper function that replaces .c, .cpp, and .cu file endings with .o:
1275
+ GET_OBJ_FILE = $(patsubst %.c,%.o,$(patsubst %.cpp,%.o,$(patsubst %.cu,%.o,$(1))))
1276
+
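The comments above describe how each example rule compiles its source to an object file and then swaps that object in for the source on the link line. A small standalone sketch that only prints the resulting values is given below; `SRC`, `DEPS`, and `LINK_INPUTS` are illustrative names standing in for `$<`, `$^`, and the link inputs, and the file list is an assumption.

```make
# get-obj.mk: hypothetical file, run with: make -f get-obj.mk
GET_OBJ_FILE = $(patsubst %.c,%.o,$(patsubst %.cpp,%.o,$(patsubst %.cu,%.o,$(1))))

# Stand-ins for $< (first prerequisite) and $^ (all prerequisites).
SRC  := examples/main/main.cpp
DEPS := $(SRC) common/common.h ggml/src/ggml.o

# Drop headers and the source itself, then append the object built from it.
LINK_INPUTS := $(filter-out %.h $(SRC),$(DEPS)) $(call GET_OBJ_FILE,$(SRC))

$(info object file: $(call GET_OBJ_FILE,$(SRC)))
$(info link inputs: $(LINK_INPUTS))

# No-op default target so that make has something to build.
all: ; @:
```

With these inputs the object file resolves to `examples/main/main.o`, and the link inputs are `ggml/src/ggml.o examples/main/main.o`.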
1277
+ llama-cli: examples/main/main.cpp \
1278
+ $(OBJ_ALL)
1279
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1280
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1281
+ @echo
1282
+ @echo '==== Run ./llama-cli -h for help. ===='
1283
+ @echo
1284
+
1285
+ llama-infill: examples/infill/infill.cpp \
1286
+ $(OBJ_ALL)
1287
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1288
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1289
+
1290
+ llama-simple: examples/simple/simple.cpp \
1291
+ $(OBJ_ALL)
1292
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1293
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1294
+
1295
+ llama-simple-chat: examples/simple-chat/simple-chat.cpp \
1296
+ $(OBJ_ALL)
1297
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1298
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1299
+
1300
+ llama-tokenize: examples/tokenize/tokenize.cpp \
1301
+ $(OBJ_ALL)
1302
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1303
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1304
+
1305
+ llama-batched: examples/batched/batched.cpp \
1306
+ $(OBJ_ALL)
1307
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1308
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1309
+
1310
+ llama-batched-bench: examples/batched-bench/batched-bench.cpp \
1311
+ $(OBJ_ALL)
1312
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1313
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1314
+
1315
+ llama-quantize: examples/quantize/quantize.cpp \
1316
+ $(OBJ_ALL)
1317
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1318
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1319
+
1320
+ llama-quantize-stats: examples/quantize-stats/quantize-stats.cpp \
1321
+ $(OBJ_ALL)
1322
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1323
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1324
+
1325
+ llama-perplexity: examples/perplexity/perplexity.cpp \
1326
+ $(OBJ_ALL)
1327
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1328
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1329
+
1330
+ llama-imatrix: examples/imatrix/imatrix.cpp \
1331
+ $(OBJ_ALL)
1332
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1333
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1334
+
1335
+ llama-embedding: examples/embedding/embedding.cpp \
1336
+ $(OBJ_ALL)
1337
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1338
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1339
+
1340
+ llama-gritlm: examples/gritlm/gritlm.cpp \
1341
+ $(OBJ_ALL)
1342
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1343
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1344
+
1345
+ llama-save-load-state: examples/save-load-state/save-load-state.cpp \
1346
+ $(OBJ_ALL)
1347
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1348
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1349
+
1350
+ llama-gguf: examples/gguf/gguf.cpp \
1351
+ $(OBJ_GGML)
1352
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1353
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1354
+
1355
+ examples/gguf-hash/deps/sha1/sha1.o: \
1356
+ examples/gguf-hash/deps/sha1/sha1.c
1357
+ $(CC) $(CFLAGS) -Iexamples/gguf-hash/deps -c $< -o $@
1358
+
1359
+ examples/gguf-hash/deps/xxhash/xxhash.o: \
1360
+ examples/gguf-hash/deps/xxhash/xxhash.c
1361
+ $(CC) $(CFLAGS) -Iexamples/gguf-hash/deps -c $< -o $@
1362
+
1363
+ examples/gguf-hash/deps/sha256/sha256.o: \
1364
+ examples/gguf-hash/deps/sha256/sha256.c
1365
+ $(CC) $(CFLAGS) -Iexamples/gguf-hash/deps -c $< -o $@
1366
+
1367
+ llama-gguf-hash: examples/gguf-hash/gguf-hash.cpp examples/gguf-hash/deps/sha1/sha1.o examples/gguf-hash/deps/xxhash/xxhash.o examples/gguf-hash/deps/sha256/sha256.o \
1368
+ $(OBJ_ALL)
1369
+ $(CXX) $(CXXFLAGS) -Iexamples/gguf-hash/deps -c $< -o $(call GET_OBJ_FILE, $<)
1370
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1371
+
1372
+ llama-gguf-split: examples/gguf-split/gguf-split.cpp \
1373
+ $(OBJ_ALL)
1374
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1375
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1376
+
1377
+ llama-eval-callback: examples/eval-callback/eval-callback.cpp \
1378
+ $(OBJ_ALL)
1379
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1380
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1381
+
1382
+ llama-cvector-generator: examples/cvector-generator/cvector-generator.cpp \
1383
+ $(OBJ_ALL)
1384
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1385
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1386
+
1387
+ llama-convert-llama2c-to-ggml: examples/convert-llama2c-to-ggml/convert-llama2c-to-ggml.cpp \
1388
+ $(OBJ_ALL)
1389
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1390
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1391
+
1392
+ llama-bench: examples/llama-bench/llama-bench.cpp \
1393
+ $(OBJ_ALL)
1394
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1395
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1396
+
1397
+ llama-export-lora: examples/export-lora/export-lora.cpp \
1398
+ $(OBJ_ALL)
1399
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1400
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1401
+
1402
+ llama-retrieval: examples/retrieval/retrieval.cpp \
1403
+ $(OBJ_ALL)
1404
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1405
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1406
+
1407
+ llama-speculative: examples/speculative/speculative.cpp \
1408
+ $(OBJ_ALL)
1409
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1410
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1411
+
1412
+ llama-parallel: examples/parallel/parallel.cpp \
1413
+ $(OBJ_ALL)
1414
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1415
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1416
+
1417
+ llama-lookahead: examples/lookahead/lookahead.cpp \
1418
+ $(OBJ_ALL)
1419
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1420
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1421
+
1422
+ llama-lookup: examples/lookup/lookup.cpp \
1423
+ $(OBJ_ALL)
1424
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1425
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1426
+
1427
+ llama-lookup-create: examples/lookup/lookup-create.cpp \
1428
+ $(OBJ_ALL)
1429
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1430
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1431
+
1432
+ llama-lookup-merge: examples/lookup/lookup-merge.cpp \
1433
+ $(OBJ_ALL)
1434
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1435
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1436
+
1437
+ llama-lookup-stats: examples/lookup/lookup-stats.cpp \
1438
+ $(OBJ_ALL)
1439
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1440
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1441
+
1442
+ llama-passkey: examples/passkey/passkey.cpp \
1443
+ $(OBJ_ALL)
1444
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1445
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1446
+
1447
+ llama-gbnf-validator: examples/gbnf-validator/gbnf-validator.cpp \
1448
+ $(OBJ_ALL)
1449
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1450
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1451
+
1452
+ ifdef GGML_RPC
1453
+ rpc-server: examples/rpc/rpc-server.cpp \
1454
+ $(OBJ_GGML)
1455
+ $(CXX) $(CXXFLAGS) $^ -o $@ $(LDFLAGS)
1456
+ endif # GGML_RPC
1457
+
1458
+ llama-server: \
1459
+ examples/server/server.cpp \
1460
+ examples/server/utils.hpp \
1461
+ examples/server/httplib.h \
1462
+ examples/server/index.html.hpp \
1463
+ examples/server/completion.js.hpp \
1464
+ examples/server/loading.html.hpp \
1465
+ examples/server/deps_daisyui.min.css.hpp \
1466
+ examples/server/deps_markdown-it.js.hpp \
1467
+ examples/server/deps_tailwindcss.js.hpp \
1468
+ examples/server/deps_vue.esm-browser.js.hpp \
1469
+ common/json.hpp \
1470
+ common/stb_image.h \
1471
+ $(OBJ_ALL)
1472
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1473
+ $(CXX) $(CXXFLAGS) $(filter-out %.h %.hpp $<,$^) -Iexamples/server $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS) $(LWINSOCK2)
1474
+
1475
+ # Portable equivalent of `cd examples/server/public && xxd -i $(notdir $<) ../$(notdir $<).hpp`:
1476
+ examples/server/%.hpp: examples/server/public/% Makefile
1477
+ @( export NAME=$(subst .,_,$(subst -,_,$(notdir $<))) && \
1478
+ echo "unsigned char $${NAME}[] = {" && \
1479
+ cat $< | od -v -t x1 -An | sed -E 's/([0-9a-fA-F]+)/0x\1, /g' && \
1480
+ echo "};" && \
1481
+ echo "unsigned int $${NAME}_len = $(shell cat $< | wc -c );" \
1482
+ ) > $@
1483
+
1484
+ llama-gen-docs: examples/gen-docs/gen-docs.cpp \
1485
+ $(OBJ_ALL)
1486
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1487
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1488
+
1489
+ libllava.a: examples/llava/llava.cpp \
1490
+ examples/llava/llava.h \
1491
+ examples/llava/clip.cpp \
1492
+ examples/llava/clip.h \
1493
+ common/stb_image.h \
1494
+ common/base64.hpp \
1495
+ $(OBJ_ALL)
1496
+ $(CXX) $(CXXFLAGS) -static -fPIC -c $< -o $@ -Wno-cast-qual
1497
+
1498
+ llama-llava-cli: examples/llava/llava-cli.cpp \
1499
+ examples/llava/llava.cpp \
1500
+ examples/llava/llava.h \
1501
+ examples/llava/clip.cpp \
1502
+ examples/llava/clip.h \
1503
+ $(OBJ_ALL)
1504
+ $(CXX) $(CXXFLAGS) $< $(filter-out %.h $<,$^) -o $@ $(LDFLAGS) -Wno-cast-qual
1505
+
1506
+ llama-minicpmv-cli: examples/llava/minicpmv-cli.cpp \
1507
+ examples/llava/llava.cpp \
1508
+ examples/llava/llava.h \
1509
+ examples/llava/clip.cpp \
1510
+ examples/llava/clip.h \
1511
+ $(OBJ_ALL)
1512
+ $(CXX) $(CXXFLAGS) $< $(filter-out %.h $<,$^) -o $@ $(LDFLAGS) -Wno-cast-qual
1513
+
1514
+ ifeq ($(UNAME_S),Darwin)
1515
+ swift: examples/batched.swift
1516
+ (cd examples/batched.swift; make build)
1517
+ endif
1518
+
1519
+ common/build-info.cpp: $(wildcard .git/index) scripts/build-info.sh
1520
+ @sh scripts/build-info.sh "$(CC)" > $@.tmp
1521
+ @if ! cmp -s $@.tmp $@; then \
1522
+ mv $@.tmp $@; \
1523
+ else \
1524
+ rm $@.tmp; \
1525
+ fi
1526
+
1527
+ common/build-info.o: common/build-info.cpp
1528
+ $(CXX) $(CXXFLAGS) -c $(filter-out %.h,$^) -o $@
1529
+
1530
+ #
1531
+ # Tests
1532
+ #
1533
+
1534
+ tests: $(TEST_TARGETS)
1535
+
1536
+ tests/test-arg-parser: tests/test-arg-parser.cpp \
1537
+ $(OBJ_ALL)
1538
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1539
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1540
+
1541
+ tests/test-llama-grammar: tests/test-llama-grammar.cpp \
1542
+ $(OBJ_ALL)
1543
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1544
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1545
+
1546
+ tests/test-log: tests/test-log.cpp \
1547
+ $(OBJ_ALL)
1548
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1549
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1550
+
1551
+ tests/test-grammar-parser: tests/test-grammar-parser.cpp \
1552
+ $(OBJ_ALL)
1553
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1554
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1555
+
1556
+ tests/test-grammar-integration: tests/test-grammar-integration.cpp \
1557
+ $(OBJ_ALL)
1558
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1559
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1560
+
1561
+ tests/test-double-float: tests/test-double-float.cpp
1562
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1563
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1564
+
1565
+ tests/test-json-schema-to-grammar: tests/test-json-schema-to-grammar.cpp \
1566
+ $(OBJ_ALL)
1567
+ $(CXX) $(CXXFLAGS) -Iexamples/server -c $< -o $(call GET_OBJ_FILE, $<)
1568
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1569
+
1570
+ tests/test-grad0: tests/test-grad0.cpp \
1571
+ $(OBJ_GGML)
1572
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1573
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1574
+
1575
+ tests/test-opt: tests/test-opt.cpp \
1576
+ $(OBJ_GGML)
1577
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1578
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1579
+
1580
+ tests/test-quantize-fns: tests/test-quantize-fns.cpp \
1581
+ $(OBJ_GGML)
1582
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1583
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1584
+
1585
+ tests/test-quantize-perf: tests/test-quantize-perf.cpp \
1586
+ $(OBJ_GGML)
1587
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1588
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1589
+
1590
+ tests/test-sampling: tests/test-sampling.cpp \
1591
+ $(OBJ_ALL)
1592
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1593
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1594
+
1595
+ tests/test-tokenizer-0: tests/test-tokenizer-0.cpp \
1596
+ $(OBJ_ALL)
1597
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1598
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1599
+
1600
+ tests/test-tokenizer-1-bpe: tests/test-tokenizer-1-bpe.cpp \
1601
+ $(OBJ_ALL)
1602
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1603
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1604
+
1605
+ tests/test-tokenizer-1-spm: tests/test-tokenizer-1-spm.cpp \
1606
+ $(OBJ_ALL)
1607
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1608
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1609
+
1610
+ tests/test-rope: tests/test-rope.cpp ggml/src/ggml.o \
1611
+ $(OBJ_GGML)
1612
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1613
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1614
+
1615
+ tests/test-c.o: tests/test-c.c include/llama.h
1616
+ $(CC) $(CFLAGS) -c $(filter-out %.h,$^) -o $@
1617
+
1618
+ tests/test-backend-ops: tests/test-backend-ops.cpp \
1619
+ $(OBJ_GGML)
1620
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1621
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1622
+
1623
+ tests/test-model-load-cancel: tests/test-model-load-cancel.cpp tests/get-model.cpp \
1624
+ $(OBJ_ALL)
1625
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1626
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1627
+
1628
+ tests/test-autorelease: tests/test-autorelease.cpp tests/get-model.cpp \
1629
+ $(OBJ_ALL)
1630
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1631
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1632
+
1633
+ tests/test-chat-template: tests/test-chat-template.cpp \
1634
+ $(OBJ_ALL)
1635
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1636
+ $(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1637
+
1638
+ #
1639
+ # PoCs
1640
+ #
1641
+
1642
+ llama-vdot: pocs/vdot/vdot.cpp ggml/src/ggml.o \
1643
+ $(OBJ_GGML)
1644
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1645
+ $(CXX) $(CXXFLAGS) $(filter-out $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1646
+
1647
+ llama-q8dot: pocs/vdot/q8dot.cpp ggml/src/ggml.o \
1648
+ $(OBJ_GGML)
1649
+ $(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
1650
+ $(CXX) $(CXXFLAGS) $(filter-out $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)
1651
+
1652
+ #
1653
+ # Deprecated binaries, kept around long enough for people to migrate to the new filenames; they can be removed afterwards.
1654
+ #
1655
+ # Mark legacy binary targets as .PHONY so that they are always checked.
1656
+ .PHONY: main quantize perplexity embedding server
1657
+
1658
+ # Define the object file target
1659
+ examples/deprecation-warning/deprecation-warning.o: examples/deprecation-warning/deprecation-warning.cpp
1660
+ $(CXX) $(CXXFLAGS) -c $< -o $@
1661
+
1662
+ # NOTE: We currently will always build the deprecation-warning `main` and `server` binaries to help users migrate.
1663
+ # Eventually we will want to stop building these targets all the time.
1664
+ main: examples/deprecation-warning/deprecation-warning.o
1665
+ $(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS)
1666
+ @echo "NOTICE: The 'main' binary is deprecated. Please use 'llama-cli' instead."
1667
+
1668
+ server: examples/deprecation-warning/deprecation-warning.o
1669
+ $(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS)
1670
+ @echo "NOTICE: The 'server' binary is deprecated. Please use 'llama-server' instead."
1671
+
1672
+ quantize: examples/deprecation-warning/deprecation-warning.o
1673
+ ifneq (,$(wildcard quantize))
1674
+ $(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS)
1675
+ @echo "#########"
1676
+ @echo "WARNING: The 'quantize' binary is deprecated. Please use 'llama-quantize' instead."
1677
+ @echo " Remove the 'quantize' binary to remove this warning."
1678
+ @echo "#########"
1679
+ endif
1680
+
1681
+ perplexity: examples/deprecation-warning/deprecation-warning.o
1682
+ ifneq (,$(wildcard perplexity))
1683
+ $(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS)
1684
+ @echo "#########"
1685
+ @echo "WARNING: The 'perplexity' binary is deprecated. Please use 'llama-perplexity' instead."
1686
+ @echo " Remove the 'perplexity' binary to remove this warning."
1687
+ @echo "#########"
1688
+ endif
1689
+
1690
+ embedding: examples/deprecation-warning/deprecation-warning.o
1691
+ ifneq (,$(wildcard embedding))
1692
+ $(CXX) $(CXXFLAGS) $< -o $@ $(LDFLAGS)
1693
+ @echo "#########"
1694
+ @echo "WARNING: The 'embedding' binary is deprecated. Please use 'llama-embedding' instead."
1695
+ @echo " Remove the 'embedding' binary to remove this warning."
1696
+ @echo "#########"
1697
+ endif
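The `quantize`, `perplexity`, and `embedding` rules above rebuild the warning binary only when a file of that name already exists, using `ifneq (,$(wildcard ...))`. A minimal standalone sketch of that existence test follows; `legacy-tool` is an illustrative file name, not a target from this Makefile.

```make
# legacy-check.mk: hypothetical file, run with: make -f legacy-check.mk
# $(wildcard NAME) expands to NAME if the file exists and to nothing otherwise,
# so comparing the result against the empty string tests for existence.
ifneq (,$(wildcard legacy-tool))
$(info legacy-tool found: a stale binary exists and would be replaced)
else
$(info legacy-tool not found: nothing to do)
endif

# No-op default target so that make has something to build.
all: ; @:
```

Note that the conditional is evaluated when the Makefile is parsed, not when the recipe runs, so a file created later in the same `make` invocation is not seen by the test.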