akhaliq3 committed on
Commit
58597f0
1 Parent(s): ae48635

spaces demo

Files changed (12)
  1. LICENSE +674 -0
  2. app.py +55 -0
  3. chord_recognition.py +188 -0
  4. finetune.py +45 -0
  5. input.midi +0 -0
  6. main.py +31 -0
  7. model.py +294 -0
  8. modules.py +233 -0
  9. requirements.txt +4 -0
  10. result/continuation.midi +0 -0
  11. result/from_scratch.midi +0 -0
  12. utils.py +348 -0
LICENSE ADDED
@@ -0,0 +1,674 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 3, 29 June 2007
3
+
4
+ Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
5
+ Everyone is permitted to copy and distribute verbatim copies
6
+ of this license document, but changing it is not allowed.
7
+
8
+ Preamble
9
+
10
+ The GNU General Public License is a free, copyleft license for
11
+ software and other kinds of works.
12
+
13
+ The licenses for most software and other practical works are designed
14
+ to take away your freedom to share and change the works. By contrast,
15
+ the GNU General Public License is intended to guarantee your freedom to
16
+ share and change all versions of a program--to make sure it remains free
17
+ software for all its users. We, the Free Software Foundation, use the
18
+ GNU General Public License for most of our software; it applies also to
19
+ any other work released this way by its authors. You can apply it to
20
+ your programs, too.
21
+
22
+ When we speak of free software, we are referring to freedom, not
23
+ price. Our General Public Licenses are designed to make sure that you
24
+ have the freedom to distribute copies of free software (and charge for
25
+ them if you wish), that you receive source code or can get it if you
26
+ want it, that you can change the software or use pieces of it in new
27
+ free programs, and that you know you can do these things.
28
+
29
+ To protect your rights, we need to prevent others from denying you
30
+ these rights or asking you to surrender the rights. Therefore, you have
31
+ certain responsibilities if you distribute copies of the software, or if
32
+ you modify it: responsibilities to respect the freedom of others.
33
+
34
+ For example, if you distribute copies of such a program, whether
35
+ gratis or for a fee, you must pass on to the recipients the same
36
+ freedoms that you received. You must make sure that they, too, receive
37
+ or can get the source code. And you must show them these terms so they
38
+ know their rights.
39
+
40
+ Developers that use the GNU GPL protect your rights with two steps:
41
+ (1) assert copyright on the software, and (2) offer you this License
42
+ giving you legal permission to copy, distribute and/or modify it.
43
+
44
+ For the developers' and authors' protection, the GPL clearly explains
45
+ that there is no warranty for this free software. For both users' and
46
+ authors' sake, the GPL requires that modified versions be marked as
47
+ changed, so that their problems will not be attributed erroneously to
48
+ authors of previous versions.
49
+
50
+ Some devices are designed to deny users access to install or run
51
+ modified versions of the software inside them, although the manufacturer
52
+ can do so. This is fundamentally incompatible with the aim of
53
+ protecting users' freedom to change the software. The systematic
54
+ pattern of such abuse occurs in the area of products for individuals to
55
+ use, which is precisely where it is most unacceptable. Therefore, we
56
+ have designed this version of the GPL to prohibit the practice for those
57
+ products. If such problems arise substantially in other domains, we
58
+ stand ready to extend this provision to those domains in future versions
59
+ of the GPL, as needed to protect the freedom of users.
60
+
61
+ Finally, every program is threatened constantly by software patents.
62
+ States should not allow patents to restrict development and use of
63
+ software on general-purpose computers, but in those that do, we wish to
64
+ avoid the special danger that patents applied to a free program could
65
+ make it effectively proprietary. To prevent this, the GPL assures that
66
+ patents cannot be used to render the program non-free.
67
+
68
+ The precise terms and conditions for copying, distribution and
69
+ modification follow.
70
+
71
+ TERMS AND CONDITIONS
72
+
73
+ 0. Definitions.
74
+
75
+ "This License" refers to version 3 of the GNU General Public License.
76
+
77
+ "Copyright" also means copyright-like laws that apply to other kinds of
78
+ works, such as semiconductor masks.
79
+
80
+ "The Program" refers to any copyrightable work licensed under this
81
+ License. Each licensee is addressed as "you". "Licensees" and
82
+ "recipients" may be individuals or organizations.
83
+
84
+ To "modify" a work means to copy from or adapt all or part of the work
85
+ in a fashion requiring copyright permission, other than the making of an
86
+ exact copy. The resulting work is called a "modified version" of the
87
+ earlier work or a work "based on" the earlier work.
88
+
89
+ A "covered work" means either the unmodified Program or a work based
90
+ on the Program.
91
+
92
+ To "propagate" a work means to do anything with it that, without
93
+ permission, would make you directly or secondarily liable for
94
+ infringement under applicable copyright law, except executing it on a
95
+ computer or modifying a private copy. Propagation includes copying,
96
+ distribution (with or without modification), making available to the
97
+ public, and in some countries other activities as well.
98
+
99
+ To "convey" a work means any kind of propagation that enables other
100
+ parties to make or receive copies. Mere interaction with a user through
101
+ a computer network, with no transfer of a copy, is not conveying.
102
+
103
+ An interactive user interface displays "Appropriate Legal Notices"
104
+ to the extent that it includes a convenient and prominently visible
105
+ feature that (1) displays an appropriate copyright notice, and (2)
106
+ tells the user that there is no warranty for the work (except to the
107
+ extent that warranties are provided), that licensees may convey the
108
+ work under this License, and how to view a copy of this License. If
109
+ the interface presents a list of user commands or options, such as a
110
+ menu, a prominent item in the list meets this criterion.
111
+
112
+ 1. Source Code.
113
+
114
+ The "source code" for a work means the preferred form of the work
115
+ for making modifications to it. "Object code" means any non-source
116
+ form of a work.
117
+
118
+ A "Standard Interface" means an interface that either is an official
119
+ standard defined by a recognized standards body, or, in the case of
120
+ interfaces specified for a particular programming language, one that
121
+ is widely used among developers working in that language.
122
+
123
+ The "System Libraries" of an executable work include anything, other
124
+ than the work as a whole, that (a) is included in the normal form of
125
+ packaging a Major Component, but which is not part of that Major
126
+ Component, and (b) serves only to enable use of the work with that
127
+ Major Component, or to implement a Standard Interface for which an
128
+ implementation is available to the public in source code form. A
129
+ "Major Component", in this context, means a major essential component
130
+ (kernel, window system, and so on) of the specific operating system
131
+ (if any) on which the executable work runs, or a compiler used to
132
+ produce the work, or an object code interpreter used to run it.
133
+
134
+ The "Corresponding Source" for a work in object code form means all
135
+ the source code needed to generate, install, and (for an executable
136
+ work) run the object code and to modify the work, including scripts to
137
+ control those activities. However, it does not include the work's
138
+ System Libraries, or general-purpose tools or generally available free
139
+ programs which are used unmodified in performing those activities but
140
+ which are not part of the work. For example, Corresponding Source
141
+ includes interface definition files associated with source files for
142
+ the work, and the source code for shared libraries and dynamically
143
+ linked subprograms that the work is specifically designed to require,
144
+ such as by intimate data communication or control flow between those
145
+ subprograms and other parts of the work.
146
+
147
+ The Corresponding Source need not include anything that users
148
+ can regenerate automatically from other parts of the Corresponding
149
+ Source.
150
+
151
+ The Corresponding Source for a work in source code form is that
152
+ same work.
153
+
154
+ 2. Basic Permissions.
155
+
156
+ All rights granted under this License are granted for the term of
157
+ copyright on the Program, and are irrevocable provided the stated
158
+ conditions are met. This License explicitly affirms your unlimited
159
+ permission to run the unmodified Program. The output from running a
160
+ covered work is covered by this License only if the output, given its
161
+ content, constitutes a covered work. This License acknowledges your
162
+ rights of fair use or other equivalent, as provided by copyright law.
163
+
164
+ You may make, run and propagate covered works that you do not
165
+ convey, without conditions so long as your license otherwise remains
166
+ in force. You may convey covered works to others for the sole purpose
167
+ of having them make modifications exclusively for you, or provide you
168
+ with facilities for running those works, provided that you comply with
169
+ the terms of this License in conveying all material for which you do
170
+ not control copyright. Those thus making or running the covered works
171
+ for you must do so exclusively on your behalf, under your direction
172
+ and control, on terms that prohibit them from making any copies of
173
+ your copyrighted material outside their relationship with you.
174
+
175
+ Conveying under any other circumstances is permitted solely under
176
+ the conditions stated below. Sublicensing is not allowed; section 10
177
+ makes it unnecessary.
178
+
179
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
180
+
181
+ No covered work shall be deemed part of an effective technological
182
+ measure under any applicable law fulfilling obligations under article
183
+ 11 of the WIPO copyright treaty adopted on 20 December 1996, or
184
+ similar laws prohibiting or restricting circumvention of such
185
+ measures.
186
+
187
+ When you convey a covered work, you waive any legal power to forbid
188
+ circumvention of technological measures to the extent such circumvention
189
+ is effected by exercising rights under this License with respect to
190
+ the covered work, and you disclaim any intention to limit operation or
191
+ modification of the work as a means of enforcing, against the work's
192
+ users, your or third parties' legal rights to forbid circumvention of
193
+ technological measures.
194
+
195
+ 4. Conveying Verbatim Copies.
196
+
197
+ You may convey verbatim copies of the Program's source code as you
198
+ receive it, in any medium, provided that you conspicuously and
199
+ appropriately publish on each copy an appropriate copyright notice;
200
+ keep intact all notices stating that this License and any
201
+ non-permissive terms added in accord with section 7 apply to the code;
202
+ keep intact all notices of the absence of any warranty; and give all
203
+ recipients a copy of this License along with the Program.
204
+
205
+ You may charge any price or no price for each copy that you convey,
206
+ and you may offer support or warranty protection for a fee.
207
+
208
+ 5. Conveying Modified Source Versions.
209
+
210
+ You may convey a work based on the Program, or the modifications to
211
+ produce it from the Program, in the form of source code under the
212
+ terms of section 4, provided that you also meet all of these conditions:
213
+
214
+ a) The work must carry prominent notices stating that you modified
215
+ it, and giving a relevant date.
216
+
217
+ b) The work must carry prominent notices stating that it is
218
+ released under this License and any conditions added under section
219
+ 7. This requirement modifies the requirement in section 4 to
220
+ "keep intact all notices".
221
+
222
+ c) You must license the entire work, as a whole, under this
223
+ License to anyone who comes into possession of a copy. This
224
+ License will therefore apply, along with any applicable section 7
225
+ additional terms, to the whole of the work, and all its parts,
226
+ regardless of how they are packaged. This License gives no
227
+ permission to license the work in any other way, but it does not
228
+ invalidate such permission if you have separately received it.
229
+
230
+ d) If the work has interactive user interfaces, each must display
231
+ Appropriate Legal Notices; however, if the Program has interactive
232
+ interfaces that do not display Appropriate Legal Notices, your
233
+ work need not make them do so.
234
+
235
+ A compilation of a covered work with other separate and independent
236
+ works, which are not by their nature extensions of the covered work,
237
+ and which are not combined with it such as to form a larger program,
238
+ in or on a volume of a storage or distribution medium, is called an
239
+ "aggregate" if the compilation and its resulting copyright are not
240
+ used to limit the access or legal rights of the compilation's users
241
+ beyond what the individual works permit. Inclusion of a covered work
242
+ in an aggregate does not cause this License to apply to the other
243
+ parts of the aggregate.
244
+
245
+ 6. Conveying Non-Source Forms.
246
+
247
+ You may convey a covered work in object code form under the terms
248
+ of sections 4 and 5, provided that you also convey the
249
+ machine-readable Corresponding Source under the terms of this License,
250
+ in one of these ways:
251
+
252
+ a) Convey the object code in, or embodied in, a physical product
253
+ (including a physical distribution medium), accompanied by the
254
+ Corresponding Source fixed on a durable physical medium
255
+ customarily used for software interchange.
256
+
257
+ b) Convey the object code in, or embodied in, a physical product
258
+ (including a physical distribution medium), accompanied by a
259
+ written offer, valid for at least three years and valid for as
260
+ long as you offer spare parts or customer support for that product
261
+ model, to give anyone who possesses the object code either (1) a
262
+ copy of the Corresponding Source for all the software in the
263
+ product that is covered by this License, on a durable physical
264
+ medium customarily used for software interchange, for a price no
265
+ more than your reasonable cost of physically performing this
266
+ conveying of source, or (2) access to copy the
267
+ Corresponding Source from a network server at no charge.
268
+
269
+ c) Convey individual copies of the object code with a copy of the
270
+ written offer to provide the Corresponding Source. This
271
+ alternative is allowed only occasionally and noncommercially, and
272
+ only if you received the object code with such an offer, in accord
273
+ with subsection 6b.
274
+
275
+ d) Convey the object code by offering access from a designated
276
+ place (gratis or for a charge), and offer equivalent access to the
277
+ Corresponding Source in the same way through the same place at no
278
+ further charge. You need not require recipients to copy the
279
+ Corresponding Source along with the object code. If the place to
280
+ copy the object code is a network server, the Corresponding Source
281
+ may be on a different server (operated by you or a third party)
282
+ that supports equivalent copying facilities, provided you maintain
283
+ clear directions next to the object code saying where to find the
284
+ Corresponding Source. Regardless of what server hosts the
285
+ Corresponding Source, you remain obligated to ensure that it is
286
+ available for as long as needed to satisfy these requirements.
287
+
288
+ e) Convey the object code using peer-to-peer transmission, provided
289
+ you inform other peers where the object code and Corresponding
290
+ Source of the work are being offered to the general public at no
291
+ charge under subsection 6d.
292
+
293
+ A separable portion of the object code, whose source code is excluded
294
+ from the Corresponding Source as a System Library, need not be
295
+ included in conveying the object code work.
296
+
297
+ A "User Product" is either (1) a "consumer product", which means any
298
+ tangible personal property which is normally used for personal, family,
299
+ or household purposes, or (2) anything designed or sold for incorporation
300
+ into a dwelling. In determining whether a product is a consumer product,
301
+ doubtful cases shall be resolved in favor of coverage. For a particular
302
+ product received by a particular user, "normally used" refers to a
303
+ typical or common use of that class of product, regardless of the status
304
+ of the particular user or of the way in which the particular user
305
+ actually uses, or expects or is expected to use, the product. A product
306
+ is a consumer product regardless of whether the product has substantial
307
+ commercial, industrial or non-consumer uses, unless such uses represent
308
+ the only significant mode of use of the product.
309
+
310
+ "Installation Information" for a User Product means any methods,
311
+ procedures, authorization keys, or other information required to install
312
+ and execute modified versions of a covered work in that User Product from
313
+ a modified version of its Corresponding Source. The information must
314
+ suffice to ensure that the continued functioning of the modified object
315
+ code is in no case prevented or interfered with solely because
316
+ modification has been made.
317
+
318
+ If you convey an object code work under this section in, or with, or
319
+ specifically for use in, a User Product, and the conveying occurs as
320
+ part of a transaction in which the right of possession and use of the
321
+ User Product is transferred to the recipient in perpetuity or for a
322
+ fixed term (regardless of how the transaction is characterized), the
323
+ Corresponding Source conveyed under this section must be accompanied
324
+ by the Installation Information. But this requirement does not apply
325
+ if neither you nor any third party retains the ability to install
326
+ modified object code on the User Product (for example, the work has
327
+ been installed in ROM).
328
+
329
+ The requirement to provide Installation Information does not include a
330
+ requirement to continue to provide support service, warranty, or updates
331
+ for a work that has been modified or installed by the recipient, or for
332
+ the User Product in which it has been modified or installed. Access to a
333
+ network may be denied when the modification itself materially and
334
+ adversely affects the operation of the network or violates the rules and
335
+ protocols for communication across the network.
336
+
337
+ Corresponding Source conveyed, and Installation Information provided,
338
+ in accord with this section must be in a format that is publicly
339
+ documented (and with an implementation available to the public in
340
+ source code form), and must require no special password or key for
341
+ unpacking, reading or copying.
342
+
343
+ 7. Additional Terms.
344
+
345
+ "Additional permissions" are terms that supplement the terms of this
346
+ License by making exceptions from one or more of its conditions.
347
+ Additional permissions that are applicable to the entire Program shall
348
+ be treated as though they were included in this License, to the extent
349
+ that they are valid under applicable law. If additional permissions
350
+ apply only to part of the Program, that part may be used separately
351
+ under those permissions, but the entire Program remains governed by
352
+ this License without regard to the additional permissions.
353
+
354
+ When you convey a copy of a covered work, you may at your option
355
+ remove any additional permissions from that copy, or from any part of
356
+ it. (Additional permissions may be written to require their own
357
+ removal in certain cases when you modify the work.) You may place
358
+ additional permissions on material, added by you to a covered work,
359
+ for which you have or can give appropriate copyright permission.
360
+
361
+ Notwithstanding any other provision of this License, for material you
362
+ add to a covered work, you may (if authorized by the copyright holders of
363
+ that material) supplement the terms of this License with terms:
364
+
365
+ a) Disclaiming warranty or limiting liability differently from the
366
+ terms of sections 15 and 16 of this License; or
367
+
368
+ b) Requiring preservation of specified reasonable legal notices or
369
+ author attributions in that material or in the Appropriate Legal
370
+ Notices displayed by works containing it; or
371
+
372
+ c) Prohibiting misrepresentation of the origin of that material, or
373
+ requiring that modified versions of such material be marked in
374
+ reasonable ways as different from the original version; or
375
+
376
+ d) Limiting the use for publicity purposes of names of licensors or
377
+ authors of the material; or
378
+
379
+ e) Declining to grant rights under trademark law for use of some
380
+ trade names, trademarks, or service marks; or
381
+
382
+ f) Requiring indemnification of licensors and authors of that
383
+ material by anyone who conveys the material (or modified versions of
384
+ it) with contractual assumptions of liability to the recipient, for
385
+ any liability that these contractual assumptions directly impose on
386
+ those licensors and authors.
387
+
388
+ All other non-permissive additional terms are considered "further
389
+ restrictions" within the meaning of section 10. If the Program as you
390
+ received it, or any part of it, contains a notice stating that it is
391
+ governed by this License along with a term that is a further
392
+ restriction, you may remove that term. If a license document contains
393
+ a further restriction but permits relicensing or conveying under this
394
+ License, you may add to a covered work material governed by the terms
395
+ of that license document, provided that the further restriction does
396
+ not survive such relicensing or conveying.
397
+
398
+ If you add terms to a covered work in accord with this section, you
399
+ must place, in the relevant source files, a statement of the
400
+ additional terms that apply to those files, or a notice indicating
401
+ where to find the applicable terms.
402
+
403
+ Additional terms, permissive or non-permissive, may be stated in the
404
+ form of a separately written license, or stated as exceptions;
405
+ the above requirements apply either way.
406
+
407
+ 8. Termination.
408
+
409
+ You may not propagate or modify a covered work except as expressly
410
+ provided under this License. Any attempt otherwise to propagate or
411
+ modify it is void, and will automatically terminate your rights under
412
+ this License (including any patent licenses granted under the third
413
+ paragraph of section 11).
414
+
415
+ However, if you cease all violation of this License, then your
416
+ license from a particular copyright holder is reinstated (a)
417
+ provisionally, unless and until the copyright holder explicitly and
418
+ finally terminates your license, and (b) permanently, if the copyright
419
+ holder fails to notify you of the violation by some reasonable means
420
+ prior to 60 days after the cessation.
421
+
422
+ Moreover, your license from a particular copyright holder is
423
+ reinstated permanently if the copyright holder notifies you of the
424
+ violation by some reasonable means, this is the first time you have
425
+ received notice of violation of this License (for any work) from that
426
+ copyright holder, and you cure the violation prior to 30 days after
427
+ your receipt of the notice.
428
+
429
+ Termination of your rights under this section does not terminate the
430
+ licenses of parties who have received copies or rights from you under
431
+ this License. If your rights have been terminated and not permanently
432
+ reinstated, you do not qualify to receive new licenses for the same
433
+ material under section 10.
434
+
435
+ 9. Acceptance Not Required for Having Copies.
436
+
437
+ You are not required to accept this License in order to receive or
438
+ run a copy of the Program. Ancillary propagation of a covered work
439
+ occurring solely as a consequence of using peer-to-peer transmission
440
+ to receive a copy likewise does not require acceptance. However,
441
+ nothing other than this License grants you permission to propagate or
442
+ modify any covered work. These actions infringe copyright if you do
443
+ not accept this License. Therefore, by modifying or propagating a
444
+ covered work, you indicate your acceptance of this License to do so.
445
+
446
+ 10. Automatic Licensing of Downstream Recipients.
447
+
448
+ Each time you convey a covered work, the recipient automatically
449
+ receives a license from the original licensors, to run, modify and
450
+ propagate that work, subject to this License. You are not responsible
451
+ for enforcing compliance by third parties with this License.
452
+
453
+ An "entity transaction" is a transaction transferring control of an
454
+ organization, or substantially all assets of one, or subdividing an
455
+ organization, or merging organizations. If propagation of a covered
456
+ work results from an entity transaction, each party to that
457
+ transaction who receives a copy of the work also receives whatever
458
+ licenses to the work the party's predecessor in interest had or could
459
+ give under the previous paragraph, plus a right to possession of the
460
+ Corresponding Source of the work from the predecessor in interest, if
461
+ the predecessor has it or can get it with reasonable efforts.
462
+
463
+ You may not impose any further restrictions on the exercise of the
464
+ rights granted or affirmed under this License. For example, you may
465
+ not impose a license fee, royalty, or other charge for exercise of
466
+ rights granted under this License, and you may not initiate litigation
467
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
468
+ any patent claim is infringed by making, using, selling, offering for
469
+ sale, or importing the Program or any portion of it.
470
+
471
+ 11. Patents.
472
+
473
+ A "contributor" is a copyright holder who authorizes use under this
474
+ License of the Program or a work on which the Program is based. The
475
+ work thus licensed is called the contributor's "contributor version".
476
+
477
+ A contributor's "essential patent claims" are all patent claims
478
+ owned or controlled by the contributor, whether already acquired or
479
+ hereafter acquired, that would be infringed by some manner, permitted
480
+ by this License, of making, using, or selling its contributor version,
481
+ but do not include claims that would be infringed only as a
482
+ consequence of further modification of the contributor version. For
483
+ purposes of this definition, "control" includes the right to grant
484
+ patent sublicenses in a manner consistent with the requirements of
485
+ this License.
486
+
487
+ Each contributor grants you a non-exclusive, worldwide, royalty-free
488
+ patent license under the contributor's essential patent claims, to
489
+ make, use, sell, offer for sale, import and otherwise run, modify and
490
+ propagate the contents of its contributor version.
491
+
492
+ In the following three paragraphs, a "patent license" is any express
493
+ agreement or commitment, however denominated, not to enforce a patent
494
+ (such as an express permission to practice a patent or covenant not to
495
+ sue for patent infringement). To "grant" such a patent license to a
496
+ party means to make such an agreement or commitment not to enforce a
497
+ patent against the party.
498
+
499
+ If you convey a covered work, knowingly relying on a patent license,
500
+ and the Corresponding Source of the work is not available for anyone
501
+ to copy, free of charge and under the terms of this License, through a
+ publicly available network server or other readily accessible means,
+ then you must either (1) cause the Corresponding Source to be so
+ available, or (2) arrange to deprive yourself of the benefit of the
+ patent license for this particular work, or (3) arrange, in a manner
+ consistent with the requirements of this License, to extend the patent
+ license to downstream recipients. "Knowingly relying" means you have
+ actual knowledge that, but for the patent license, your conveying the
+ covered work in a country, or your recipient's use of the covered work
+ in a country, would infringe one or more identifiable patents in that
+ country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+ arrangement, you convey, or propagate by procuring conveyance of, a
+ covered work, and grant a patent license to some of the parties
+ receiving the covered work authorizing them to use, propagate, modify
+ or convey a specific copy of the covered work, then the patent license
+ you grant is automatically extended to all recipients of the covered
+ work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+ the scope of its coverage, prohibits the exercise of, or is
+ conditioned on the non-exercise of one or more of the rights that are
+ specifically granted under this License. You may not convey a covered
+ work if you are a party to an arrangement with a third party that is
+ in the business of distributing software, under which you make payment
+ to the third party based on the extent of your activity of conveying
+ the work, and under which the third party grants, to any of the
+ parties who would receive the covered work from you, a discriminatory
+ patent license (a) in connection with copies of the covered work
+ conveyed by you (or copies made from those copies), or (b) primarily
+ for and in connection with specific products or compilations that
+ contain the covered work, unless you entered into that arrangement,
+ or that patent license was granted, prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+ any implied license or other defenses to infringement that may
+ otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order, agreement or
+ otherwise) that contradict the conditions of this License, they do not
+ excuse you from the conditions of this License. If you cannot convey a
+ covered work so as to satisfy simultaneously your obligations under this
+ License and any other pertinent obligations, then as a consequence you may
+ not convey it at all. For example, if you agree to terms that obligate you
+ to collect a royalty for further conveying from those to whom you convey
+ the Program, the only way you could satisfy both those terms and this
+ License would be to refrain entirely from conveying the Program.
+
+ 13. Use with the GNU Affero General Public License.
+
+ Notwithstanding any other provision of this License, you have
+ permission to link or combine any covered work with a work licensed
+ under version 3 of the GNU Affero General Public License into a single
+ combined work, and to convey the resulting work. The terms of this
+ License will continue to apply to the part which is the covered work,
+ but the special requirements of the GNU Affero General Public License,
+ section 13, concerning interaction through a network will apply to the
+ combination as such.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new versions of
+ the GNU General Public License from time to time. Such new versions will
+ be similar in spirit to the present version, but may differ in detail to
+ address new problems or concerns.
+
+ Each version is given a distinguishing version number. If the
+ Program specifies that a certain numbered version of the GNU General
+ Public License "or any later version" applies to it, you have the
+ option of following the terms and conditions either of that numbered
+ version or of any later version published by the Free Software
+ Foundation. If the Program does not specify a version number of the
+ GNU General Public License, you may choose any version ever published
+ by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+ versions of the GNU General Public License can be used, that proxy's
+ public statement of acceptance of a version permanently authorizes you
+ to choose that version for the Program.
+
+ Later license versions may give you additional or different
+ permissions. However, no additional obligations are imposed on any
+ author or copyright holder as a result of your choosing to follow a
+ later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
+ HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
+ OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
+ THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
+ IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
+ ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
+ THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
+ GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
+ USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
+ DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
+ EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
+ SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+ above cannot be given local legal effect according to their terms,
+ reviewing courts shall apply local law that most closely approximates
+ an absolute waiver of all civil liability in connection with the
+ Program, unless a warranty or assumption of liability accompanies a
+ copy of the Program in return for a fee.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+ possible use to the public, the best way to achieve this is to make it
+ free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+ to attach them to the start of each source file to most effectively
+ state the exclusion of warranty; and each file should have at least
+ the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+ Also add information on how to contact you by electronic and paper mail.
+
+ If the program does terminal interaction, make it output a short
+ notice like this when it starts in an interactive mode:
+
+ <program> Copyright (C) <year> <name of author>
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+ The hypothetical commands `show w' and `show c' should show the appropriate
+ parts of the General Public License. Of course, your program's commands
+ might be different; for a GUI interface, you would use an "about box".
+
+ You should also get your employer (if you work as a programmer) or school,
+ if any, to sign a "copyright disclaimer" for the program, if necessary.
+ For more information on this, and how to apply and follow the GNU GPL, see
+ <https://www.gnu.org/licenses/>.
+
+ The GNU General Public License does not permit incorporating your program
+ into proprietary programs. If your program is a subroutine library, you
+ may consider it more useful to permit linking proprietary applications with
+ the library. If this is what you want to do, use the GNU Lesser General
+ Public License instead of this License. But first, please read
+ <https://www.gnu.org/licenses/why-not-lgpl.html>.
app.py ADDED
@@ -0,0 +1,55 @@
+ from model import PopMusicTransformer
+ import os
+ os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
+ import tensorflow as tf
+ tf.compat.v1.disable_eager_execution()
+ import gradio as gr
+ import requests
+ import torchtext
+ import zipfile
+
+ torchtext.utils.download_from_url("https://drive.google.com/uc?id=1gxuTSkF51NP04JZgTE46Pg4KQsbHQKGo", root=".")
+ torchtext.utils.download_from_url("https://drive.google.com/uc?id=1nAKjaeahlzpVAX0F9wjQEG_hL4UosSbo", root=".")
+
+ with zipfile.ZipFile("REMI-tempo-checkpoint.zip","r") as zip_ref:
+     zip_ref.extractall(".")
+ with zipfile.ZipFile("REMI-tempo-chord-checkpoint.zip","r") as zip_ref:
+     zip_ref.extractall(".")
+
+ url = 'https://github.com/AK391/remi/blob/master/input.midi?raw=true'
+ r = requests.get(url, allow_redirects=True)
+ open("input.midi", 'wb').write(r.content)
+
+
+ # declare model
+ model = PopMusicTransformer(
+     checkpoint='REMI-tempo-checkpoint',
+     is_training=False)
+
+ def inference(midi):
+     # generate continuation
+     model.generate(
+         n_target_bar=4,
+         temperature=1.2,
+         topk=5,
+         output_path='./result/continuation.midi',
+         prompt=midi.name)
+     return './result/continuation.midi'
+
+
+ title = "Pop Music Transformer"
+ description = "demo for Pop Music Transformer. To use it, simply upload your midi file, or click one of the examples to load them. Read more at the links below."
+ article = "<p style='text-align: center'><a href='https://arxiv.org/abs/2002.00212'>Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions</a> | <a href='https://github.com/YatingMusic/remi'>Github Repo</a></p>"
+
+ examples = [
+     ['input.midi']
+ ]
+ gr.Interface(
+     inference,
+     gr.inputs.File(label="Input Midi"),
+     gr.outputs.File(label="Output Midi"),
+     title=title,
+     description=description,
+     article=article,
+     examples=examples
+     ).launch()
chord_recognition.py ADDED
@@ -0,0 +1,188 @@
+ import miditoolkit
+ import numpy as np
+
+ class MIDIChord(object):
+     def __init__(self):
+         # define pitch classes
+         self.PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
+         # define chord maps (required)
+         self.CHORD_MAPS = {'maj': [0, 4],
+                            'min': [0, 3],
+                            'dim': [0, 3, 6],
+                            'aug': [0, 4, 8],
+                            'dom': [0, 4, 7, 10]}
+         # define chord insiders (+1)
+         self.CHORD_INSIDERS = {'maj': [7],
+                                'min': [7],
+                                'dim': [9],
+                                'aug': [],
+                                'dom': []}
+         # define chord outsiders (-1)
+         self.CHORD_OUTSIDERS_1 = {'maj': [2, 5, 9],
+                                   'min': [2, 5, 8],
+                                   'dim': [2, 5, 10],
+                                   'aug': [2, 5, 9],
+                                   'dom': [2, 5, 9]}
+         # define chord outsiders (-2)
+         self.CHORD_OUTSIDERS_2 = {'maj': [1, 3, 6, 8, 10],
+                                   'min': [1, 4, 6, 9, 11],
+                                   'dim': [1, 4, 7, 8, 11],
+                                   'aug': [1, 3, 6, 7, 10],
+                                   'dom': [1, 3, 6, 8, 11]}
+
+     def note2pianoroll(self, notes, max_tick, ticks_per_beat):
+         return miditoolkit.pianoroll.parser.notes2pianoroll(
+             note_stream_ori=notes,
+             max_tick=max_tick,
+             ticks_per_beat=ticks_per_beat)
+
+     def sequencing(self, chroma):
+         candidates = {}
+         for index in range(len(chroma)):
+             if chroma[index]:
+                 root_note = index
+                 _chroma = np.roll(chroma, -root_note)
+                 sequence = np.where(_chroma == 1)[0]
+                 candidates[root_note] = list(sequence)
+         return candidates
+
+     def scoring(self, candidates):
+         scores = {}
+         qualities = {}
+         for root_note, sequence in candidates.items():
+             if 3 not in sequence and 4 not in sequence:
+                 scores[root_note] = -100
+                 qualities[root_note] = 'None'
+             elif 3 in sequence and 4 in sequence:
+                 scores[root_note] = -100
+                 qualities[root_note] = 'None'
+             else:
+                 # decide quality
+                 if 3 in sequence:
+                     if 6 in sequence:
+                         quality = 'dim'
+                     else:
+                         quality = 'min'
+                 elif 4 in sequence:
+                     if 8 in sequence:
+                         quality = 'aug'
+                     else:
+                         if 7 in sequence and 10 in sequence:
+                             quality = 'dom'
+                         else:
+                             quality = 'maj'
+                 # decide score
+                 maps = self.CHORD_MAPS.get(quality)
+                 _notes = [n for n in sequence if n not in maps]
+                 score = 0
+                 for n in _notes:
+                     if n in self.CHORD_OUTSIDERS_1.get(quality):
+                         score -= 1
+                     elif n in self.CHORD_OUTSIDERS_2.get(quality):
+                         score -= 2
+                     elif n in self.CHORD_INSIDERS.get(quality):
+                         score += 1
+                 scores[root_note] = score
+                 qualities[root_note] = quality
+         return scores, qualities
+
+     def find_chord(self, pianoroll):
+         chroma = miditoolkit.pianoroll.utils.tochroma(pianoroll=pianoroll)
+         chroma = np.sum(chroma, axis=0)
+         chroma = np.array([1 if c else 0 for c in chroma])
+         if np.sum(chroma) == 0:
+             return 'N', 'N', 'N', 0
+         else:
+             candidates = self.sequencing(chroma=chroma)
+             scores, qualities = self.scoring(candidates=candidates)
+             # bass note
+             sorted_notes = []
+             for i, v in enumerate(np.sum(pianoroll, axis=0)):
+                 if v > 0:
+                     sorted_notes.append(int(i%12))
+             bass_note = sorted_notes[0]
+             # root note
+             __root_note = []
+             _max = max(scores.values())
+             for _root_note, score in scores.items():
+                 if score == _max:
+                     __root_note.append(_root_note)
+             if len(__root_note) == 1:
+                 root_note = __root_note[0]
+             else:
+                 #TODO: what should i do
+                 for n in sorted_notes:
+                     if n in __root_note:
+                         root_note = n
+                         break
+             # quality
+             quality = qualities.get(root_note)
+             sequence = candidates.get(root_note)
+             # score
+             score = scores.get(root_note)
+             return self.PITCH_CLASSES[root_note], quality, self.PITCH_CLASSES[bass_note], score
+
+     def greedy(self, candidates, max_tick, min_length):
+         chords = []
+         # start from 0
+         start_tick = 0
+         while start_tick < max_tick:
+             _candidates = candidates.get(start_tick)
+             _candidates = sorted(_candidates.items(), key=lambda x: (x[1][-1], x[0]))
+             # choose
+             end_tick, (root_note, quality, bass_note, _) = _candidates[-1]
+             if root_note == bass_note:
+                 chord = '{}:{}'.format(root_note, quality)
+             else:
+                 chord = '{}:{}/{}'.format(root_note, quality, bass_note)
+             chords.append([start_tick, end_tick, chord])
+             start_tick = end_tick
+         # remove :None
+         temp = chords
+         while ':None' in temp[0][-1]:
+             try:
+                 temp[1][0] = temp[0][0]
+                 del temp[0]
+             except:
+                 print('NO CHORD')
+                 return []
+         temp2 = []
+         for chord in temp:
+             if ':None' not in chord[-1]:
+                 temp2.append(chord)
+             else:
+                 temp2[-1][1] = chord[1]
+         return temp2
+
+     def extract(self, notes):
+         # read
+         max_tick = max([n.end for n in notes])
+         ticks_per_beat = 480
+         pianoroll = self.note2pianoroll(
+             notes=notes,
+             max_tick=max_tick,
+             ticks_per_beat=ticks_per_beat)
+         # get lots of candidates
+         candidates = {}
+         # the shortest: 2 beat, longest: 4 beat
+         for interval in [4, 2]:
+             for start_tick in range(0, max_tick, ticks_per_beat):
+                 # set target pianoroll
+                 end_tick = int(ticks_per_beat * interval + start_tick)
+                 if end_tick > max_tick:
+                     end_tick = max_tick
+                 _pianoroll = pianoroll[start_tick:end_tick, :]
+                 # find chord
+                 root_note, quality, bass_note, score = self.find_chord(pianoroll=_pianoroll)
+                 # save
+                 if start_tick not in candidates:
+                     candidates[start_tick] = {}
+                     candidates[start_tick][end_tick] = (root_note, quality, bass_note, score)
+                 else:
+                     if end_tick not in candidates[start_tick]:
+                         candidates[start_tick][end_tick] = (root_note, quality, bass_note, score)
+         # greedy
+         chords = self.greedy(candidates=candidates,
+                              max_tick=max_tick,
+                              min_length=ticks_per_beat)
+         return chords
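To see how the chord scoring in chord_recognition.py behaves on concrete inputs, the rule can be restated without miditoolkit or NumPy. The sketch below is not part of the commit: the function name `score_candidate` and the module-level table names are hypothetical, while the table contents are copied from `MIDIChord.__init__`. It takes the semitone offsets above a candidate root and returns the chosen quality and score.

```python
# Tables copied from MIDIChord.__init__; the module-level names are illustrative only.
CHORD_MAPS = {'maj': [0, 4], 'min': [0, 3], 'dim': [0, 3, 6],
              'aug': [0, 4, 8], 'dom': [0, 4, 7, 10]}
INSIDERS = {'maj': [7], 'min': [7], 'dim': [9], 'aug': [], 'dom': []}
OUTSIDERS_1 = {'maj': [2, 5, 9], 'min': [2, 5, 8], 'dim': [2, 5, 10],
               'aug': [2, 5, 9], 'dom': [2, 5, 9]}
OUTSIDERS_2 = {'maj': [1, 3, 6, 8, 10], 'min': [1, 4, 6, 9, 11],
               'dim': [1, 4, 7, 8, 11], 'aug': [1, 3, 6, 7, 10],
               'dom': [1, 3, 6, 8, 11]}


def score_candidate(intervals):
    """Return (quality, score) for semitone offsets above a candidate root."""
    has_minor_third = 3 in intervals
    has_major_third = 4 in intervals
    # Exactly one kind of third must be present, otherwise the root is rejected.
    if has_minor_third == has_major_third:
        return 'None', -100
    if has_minor_third:
        quality = 'dim' if 6 in intervals else 'min'
    elif 8 in intervals:
        quality = 'aug'
    elif 7 in intervals and 10 in intervals:
        quality = 'dom'
    else:
        quality = 'maj'
    score = 0
    for n in intervals:
        if n in CHORD_MAPS[quality]:
            continue  # required chord tones do not change the score
        if n in OUTSIDERS_1[quality]:
            score -= 1
        elif n in OUTSIDERS_2[quality]:
            score -= 2
        elif n in INSIDERS[quality]:
            score += 1
    return quality, score


print(score_candidate([0, 4, 7]))      # ('maj', 1): the fifth counts as an insider
print(score_candidate([0, 4, 9]))      # ('maj', -1): the sixth is a mild outsider
print(score_candidate([0, 2, 7]))      # ('None', -100): no third at all
```

In `find_chord`, the root with the highest such score wins, and ties are broken by the order of sounding pitches.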
finetune.py ADDED
@@ -0,0 +1,45 @@
+ from model import PopMusicTransformer
+ from glob import glob
+ import os
+ os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+ def main():
+     # declare model
+     model = PopMusicTransformer(
+         checkpoint='REMI-tempo-checkpoint',
+         is_training=True)
+     # prepare data
+     midi_paths = glob('YOUR PERSONAL FOLDER/*.midi') # you need to revise it
+     training_data = model.prepare_data(midi_paths=midi_paths)
+
+     # check output checkpoint folder
+     ####################################
+     # if you use "REMI-tempo-chord-checkpoint" for the pre-trained checkpoint
+     # please name your output folder as something with "chord"
+     # for example: my-love-chord, cute-doggy-chord, ...
+     # if you use "REMI-tempo-checkpoint"
+     # for example: my-love, cute-doggy, ...
+     ####################################
+     output_checkpoint_folder = 'REMI-finetune' # your decision
+     if not os.path.exists(output_checkpoint_folder):
+         os.mkdir(output_checkpoint_folder)
+
+     # finetune
+     model.finetune(
+         training_data=training_data,
+         output_checkpoint_folder=output_checkpoint_folder)
+
+     ####################################
+     # after finetuning, please choose which checkpoint you want to try
+     # and rename the checkpoint files you choose to "model"
+     # and copy the "dictionary.pkl" into your output_checkpoint_folder
+     # ***** the same as the content format in "REMI-tempo-checkpoint" *****
+     # and then, you can use "main.py" to generate your own music!
+     # (do not forget to revise the checkpoint path to your own in "main.py")
+     ####################################
+
+     # close
+     model.close()
+
+ if __name__ == '__main__':
+     main()
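finetune.py hands its MIDI paths to `model.prepare_data`, which slices each token sequence into fixed-length input/target windows and then collects runs of consecutive windows. The sketch below restates that slicing in plain Python so the index arithmetic can be checked on a toy sequence. The helper names `make_pairs` and `group_pairs` are hypothetical and the toy sizes are far smaller than the `x_len = 512` and `group_size = 5` used in model.py.

```python
def make_pairs(words, x_len):
    """Cut a token list into non-overlapping (input, target) windows.

    The target is the input shifted right by one token, so a window
    starting at i needs tokens i .. i + x_len inclusive.
    """
    pairs = []
    for i in range(0, len(words) - x_len - 1, x_len):
        x = words[i:i + x_len]
        y = words[i + 1:i + x_len + 1]
        pairs.append((x, y))
    return pairs


def group_pairs(pairs, group_size):
    """Collect runs of group_size consecutive windows.

    The stride of group_size * 2 mirrors the loop in prepare_data, so every
    other run of windows is skipped and the tail is dropped.
    """
    groups = []
    for i in range(0, len(pairs) - group_size, group_size * 2):
        chunk = pairs[i:i + group_size]
        if len(chunk) == group_size:
            groups.append(chunk)
    return groups


tokens = list(range(10))
print(make_pairs(tokens, x_len=3))
# [([0, 1, 2], [1, 2, 3]), ([3, 4, 5], [4, 5, 6])]
```

Under these loop bounds a piece shorter than `x_len + 2` tokens yields no windows at all, so very short MIDI files would contribute nothing to the training data.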
input.midi ADDED
Binary file (3.75 kB).
main.py ADDED
@@ -0,0 +1,31 @@
+ from model import PopMusicTransformer
+ import os
+ os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+ def main():
+     # declare model
+     model = PopMusicTransformer(
+         checkpoint='REMI-tempo-checkpoint',
+         is_training=False)
+
+     # generate from scratch
+     model.generate(
+         n_target_bar=16,
+         temperature=1.2,
+         topk=5,
+         output_path='./result/from_scratch.midi',
+         prompt=None)
+
+     # generate continuation
+     model.generate(
+         n_target_bar=16,
+         temperature=1.2,
+         topk=5,
+         output_path='./result/continuation.midi',
+         prompt='./data/evaluation/000.midi')
+
+     # close model
+     model.close()
+
+ if __name__ == '__main__':
+     main()
model.py ADDED
@@ -0,0 +1,294 @@
+ import tensorflow as tf
+ import numpy as np
+ import miditoolkit
+ import modules
+ import pickle
+ import utils
+ import time
+
+ class PopMusicTransformer(object):
+     ########################################
+     # initialize
+     ########################################
+     def __init__(self, checkpoint, is_training=False):
+         # load dictionary
+         self.dictionary_path = '{}/dictionary.pkl'.format(checkpoint)
+         self.event2word, self.word2event = pickle.load(open(self.dictionary_path, 'rb'))
+         # model settings
+         self.x_len = 512
+         self.mem_len = 512
+         self.n_layer = 12
+         self.d_embed = 512
+         self.d_model = 512
+         self.dropout = 0.1
+         self.n_head = 8
+         self.d_head = self.d_model // self.n_head
+         self.d_ff = 2048
+         self.n_token = len(self.event2word)
+         self.learning_rate = 0.0002
+         # load model
+         self.is_training = is_training
+         if self.is_training:
+             self.batch_size = 4
+         else:
+             self.batch_size = 1
+         self.checkpoint_path = '{}/model'.format(checkpoint)
+         self.load_model()
+
+     ########################################
+     # load model
+     ########################################
+     def load_model(self):
+         # placeholders
+         self.x = tf.compat.v1.placeholder(tf.int32, shape=[self.batch_size, None])
+         self.y = tf.compat.v1.placeholder(tf.int32, shape=[self.batch_size, None])
+         self.mems_i = [tf.compat.v1.placeholder(tf.float32, [self.mem_len, self.batch_size, self.d_model]) for _ in range(self.n_layer)]
+         # model
+         self.global_step = tf.compat.v1.train.get_or_create_global_step()
+         initializer = tf.compat.v1.initializers.random_normal(stddev=0.02, seed=None)
+         proj_initializer = tf.compat.v1.initializers.random_normal(stddev=0.01, seed=None)
+         with tf.compat.v1.variable_scope(tf.compat.v1.get_variable_scope()):
+             xx = tf.transpose(self.x, [1, 0])
+             yy = tf.transpose(self.y, [1, 0])
+             loss, self.logits, self.new_mem = modules.transformer(
+                 dec_inp=xx,
+                 target=yy,
+                 mems=self.mems_i,
+                 n_token=self.n_token,
+                 n_layer=self.n_layer,
+                 d_model=self.d_model,
+                 d_embed=self.d_embed,
+                 n_head=self.n_head,
+                 d_head=self.d_head,
+                 d_inner=self.d_ff,
+                 dropout=self.dropout,
+                 dropatt=self.dropout,
+                 initializer=initializer,
+                 proj_initializer=proj_initializer,
+                 is_training=self.is_training,
+                 mem_len=self.mem_len,
+                 cutoffs=[],
+                 div_val=-1,
+                 tie_projs=[],
+                 same_length=False,
+                 clamp_len=-1,
+                 input_perms=None,
+                 target_perms=None,
+                 head_target=None,
+                 untie_r=False,
+                 proj_same_dim=True)
+         self.avg_loss = tf.reduce_mean(loss)
+         # vars
+         all_vars = tf.compat.v1.trainable_variables()
+         grads = tf.gradients(self.avg_loss, all_vars)
+         grads_and_vars = list(zip(grads, all_vars))
+         all_trainable_vars = tf.reduce_sum([tf.reduce_prod(v.shape) for v in tf.compat.v1.trainable_variables()])
+         # optimizer
+         decay_lr = tf.compat.v1.train.cosine_decay(
+             self.learning_rate,
+             global_step=self.global_step,
+             decay_steps=400000,
+             alpha=0.004)
+         optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=decay_lr)
+         self.train_op = optimizer.apply_gradients(grads_and_vars, self.global_step)
+         # saver
+         self.saver = tf.compat.v1.train.Saver()
+         config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
+         config.gpu_options.allow_growth = True
+         self.sess = tf.compat.v1.Session(config=config)
+         self.saver.restore(self.sess, self.checkpoint_path)
+
+     ########################################
+     # temperature sampling
+     ########################################
+     def temperature_sampling(self, logits, temperature, topk):
+         probs = np.exp(logits / temperature) / np.sum(np.exp(logits / temperature))
+         if topk == 1:
+             prediction = np.argmax(probs)
+         else:
+             sorted_index = np.argsort(probs)[::-1]
+             candi_index = sorted_index[:topk]
+             candi_probs = [probs[i] for i in candi_index]
+             # normalize probs
+             candi_probs /= sum(candi_probs)
+             # choose by predicted probs
+             prediction = np.random.choice(candi_index, size=1, p=candi_probs)[0]
+         return prediction
+
+     ########################################
+     # extract events for prompt continuation
+     ########################################
+     def extract_events(self, input_path):
+         note_items, tempo_items = utils.read_items(input_path)
+         note_items = utils.quantize_items(note_items)
+         max_time = note_items[-1].end
+         if 'chord' in self.checkpoint_path:
+             chord_items = utils.extract_chords(note_items)
+             items = chord_items + tempo_items + note_items
+         else:
+             items = tempo_items + note_items
+         groups = utils.group_items(items, max_time)
+         events = utils.item2event(groups)
+         return events
+
+     ########################################
+     # generate
+     ########################################
+     def generate(self, n_target_bar, temperature, topk, output_path, prompt=None):
+         # if prompt, load it. Or, random start
+         if prompt:
+             events = self.extract_events(prompt)
+             words = [[self.event2word['{}_{}'.format(e.name, e.value)] for e in events]]
+             words[0].append(self.event2word['Bar_None'])
+         else:
+             words = []
+             for _ in range(self.batch_size):
+                 ws = [self.event2word['Bar_None']]
+                 if 'chord' in self.checkpoint_path:
+                     tempo_classes = [v for k, v in self.event2word.items() if 'Tempo Class' in k]
+                     tempo_values = [v for k, v in self.event2word.items() if 'Tempo Value' in k]
+                     chords = [v for k, v in self.event2word.items() if 'Chord' in k]
+                     ws.append(self.event2word['Position_1/16'])
+                     ws.append(np.random.choice(chords))
+                     ws.append(self.event2word['Position_1/16'])
+                     ws.append(np.random.choice(tempo_classes))
+                     ws.append(np.random.choice(tempo_values))
+                 else:
+                     tempo_classes = [v for k, v in self.event2word.items() if 'Tempo Class' in k]
+                     tempo_values = [v for k, v in self.event2word.items() if 'Tempo Value' in k]
+                     ws.append(self.event2word['Position_1/16'])
+                     ws.append(np.random.choice(tempo_classes))
+                     ws.append(np.random.choice(tempo_values))
+                 words.append(ws)
+         # initialize mem
+         batch_m = [np.zeros((self.mem_len, self.batch_size, self.d_model), dtype=np.float32) for _ in range(self.n_layer)]
+         # generate
+         original_length = len(words[0])
+         initial_flag = 1
+         current_generated_bar = 0
+         while current_generated_bar < n_target_bar:
+             # input
+             if initial_flag:
+                 temp_x = np.zeros((self.batch_size, original_length))
+                 for b in range(self.batch_size):
+                     for z, t in enumerate(words[b]):
+                         temp_x[b][z] = t
+                 initial_flag = 0
+             else:
+                 temp_x = np.zeros((self.batch_size, 1))
+                 for b in range(self.batch_size):
+                     temp_x[b][0] = words[b][-1]
+             # prepare feed dict
+             feed_dict = {self.x: temp_x}
+             for m, m_np in zip(self.mems_i, batch_m):
+                 feed_dict[m] = m_np
+             # model (prediction)
+             _logits, _new_mem = self.sess.run([self.logits, self.new_mem], feed_dict=feed_dict)
+             # sampling
+             _logit = _logits[-1, 0]
+             word = self.temperature_sampling(
+                 logits=_logit,
+                 temperature=temperature,
+                 topk=topk)
+             words[0].append(word)
+             # if bar event (only work for batch_size=1)
+             if word == self.event2word['Bar_None']:
+                 current_generated_bar += 1
+             # re-new mem
+             batch_m = _new_mem
+         # write
+         if prompt:
+             utils.write_midi(
+                 words=words[0][original_length:],
+                 word2event=self.word2event,
+                 output_path=output_path,
+                 prompt_path=prompt)
+         else:
+             utils.write_midi(
+                 words=words[0],
+                 word2event=self.word2event,
+                 output_path=output_path,
+                 prompt_path=None)
+
+     ########################################
+     # prepare training data
+     ########################################
+     def prepare_data(self, midi_paths):
+         # extract events
+         all_events = []
+         for path in midi_paths:
+             events = self.extract_events(path)
+             all_events.append(events)
+         # event to word
+         all_words = []
+         for events in all_events:
+             words = []
+             for event in events:
+                 e = '{}_{}'.format(event.name, event.value)
+                 if e in self.event2word:
+                     words.append(self.event2word[e])
+                 else:
+                     # OOV
+                     if event.name == 'Note Velocity':
+                         # replace with max velocity based on our training data
+                         words.append(self.event2word['Note Velocity_21'])
+                     else:
+                         # something is wrong
+                         # you should handle it for your own purpose
+                         print('something is wrong! {}'.format(e))
+             all_words.append(words)
+         # to training data
+         self.group_size = 5
+         segments = []
+         for words in all_words:
+             pairs = []
+             for i in range(0, len(words)-self.x_len-1, self.x_len):
+                 x = words[i:i+self.x_len]
+                 y = words[i+1:i+self.x_len+1]
+                 pairs.append([x, y])
+             pairs = np.array(pairs)
+             # abandon the last
+             for i in np.arange(0, len(pairs)-self.group_size, self.group_size*2):
+                 data = pairs[i:i+self.group_size]
+                 if len(data) == self.group_size:
+                     segments.append(data)
+         segments = np.array(segments)
+         return segments
+
+     ########################################
+     # finetune
+     ########################################
+     def finetune(self, training_data, output_checkpoint_folder):
+         # shuffle
+         index = np.arange(len(training_data))
+         np.random.shuffle(index)
+         training_data = training_data[index]
+         num_batches = len(training_data) // self.batch_size
+         st = time.time()
+         for e in range(200):
+             total_loss = []
+             for i in range(num_batches):
+                 segments = training_data[self.batch_size*i:self.batch_size*(i+1)]
+                 batch_m = [np.zeros((self.mem_len, self.batch_size, self.d_model), dtype=np.float32) for _ in range(self.n_layer)]
+                 for j in range(self.group_size):
+                     batch_x = segments[:, j, 0, :]
+                     batch_y = segments[:, j, 1, :]
+                     # prepare feed dict
+                     feed_dict = {self.x: batch_x, self.y: batch_y}
+                     for m, m_np in zip(self.mems_i, batch_m):
+                         feed_dict[m] = m_np
+                     # run
+                     _, gs_, loss_, new_mem_ = self.sess.run([self.train_op, self.global_step, self.avg_loss, self.new_mem], feed_dict=feed_dict)
+                     batch_m = new_mem_
+                     total_loss.append(loss_)
+                     print('>>> Epoch: {}, Step: {}, Loss: {:.5f}, Time: {:.2f}'.format(e, gs_, loss_, time.time()-st))
+             self.saver.save(self.sess, '{}/model-{:03d}-{:.3f}'.format(output_checkpoint_folder, e, np.mean(total_loss)))
+             # stop
+             if np.mean(total_loss) <= 0.1:
+                 break
+
+     ########################################
+     # close
+     ########################################
+     def close(self):
+         self.sess.close()
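The `temperature_sampling` method in model.py divides the logits by a temperature, applies a softmax, keeps the `topk` most likely tokens and samples among them. A minimal standalone sketch of the same idea follows, using only the standard library. It is an illustration rather than the repository's code: the function name `sample_top_k` and the `rng` parameter are hypothetical, and it subtracts the largest scaled logit before exponentiating, a common stabilisation step that the original method does not perform.

```python
import math
import random


def sample_top_k(logits, temperature=1.0, topk=5, rng=None):
    """Sample an index from the top-k entries of a temperature-scaled softmax."""
    rng = rng or random.Random()
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    # Subtracting the peak leaves the softmax unchanged but keeps exp() from overflowing.
    weights = [math.exp(value - peak) for value in scaled]
    # Indices ordered from most to least likely, truncated to the k best.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    candidates = ranked[:topk]
    if topk == 1:
        return candidates[0]  # greedy choice, same as argmax
    # random.choices renormalises the weights over the remaining candidates.
    return rng.choices(candidates, weights=[weights[i] for i in candidates], k=1)[0]


print(sample_top_k([0.1, 2.0, 0.3], temperature=1.2, topk=1))  # 1
```

A temperature above 1 (the scripts here use 1.2) flattens the distribution and makes less likely tokens more probable, while `topk` bounds how far the sampler can stray from the model's preferred tokens.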
modules.py ADDED
@@ -0,0 +1,233 @@
+ import tensorflow as tf
+
+ def embedding_lookup(lookup_table, x):
+     return tf.compat.v1.nn.embedding_lookup(lookup_table, x)
+
+
+ def normal_embedding_lookup(x, n_token, d_embed, d_proj, initializer,
+                             proj_initializer, scope='normal_embed', **kwargs):
+     emb_scale = d_proj ** 0.5
+     with tf.compat.v1.variable_scope(scope):
+         lookup_table = tf.compat.v1.get_variable('lookup_table', [n_token, d_embed], initializer=initializer)
+         y = embedding_lookup(lookup_table, x)
+         if d_proj != d_embed:
+             proj_W = tf.compat.v1.get_variable('proj_W', [d_embed, d_proj], initializer=proj_initializer)
+             y = tf.einsum('ibe,ed->ibd', y, proj_W)
+         else:
+             proj_W = None
+         ret_params = [lookup_table, proj_W]
+     y *= emb_scale
+     return y, ret_params
+
+
+ def normal_softmax(hidden, target, n_token, params, scope='normal_softmax', **kwargs):
+     def _logit(x, W, b, proj):
+         y = x
+         if proj is not None:
+             y = tf.einsum('ibd,ed->ibe', y, proj)
+         return tf.einsum('ibd,nd->ibn', y, W) + b
+
+     params_W, params_projs = params[0], params[1]
+
+     with tf.compat.v1.variable_scope(scope):
+         softmax_b = tf.compat.v1.get_variable('bias', [n_token], initializer=tf.zeros_initializer())
+         output = _logit(hidden, params_W, softmax_b, params_projs)
+         nll = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=target, logits=output)
+     return nll, output
+
+
+ def positional_embedding(pos_seq, inv_freq, bsz=None):
+     sinusoid_inp = tf.einsum('i,j->ij', pos_seq, inv_freq)
+     pos_emb = tf.concat([tf.sin(sinusoid_inp), tf.cos(sinusoid_inp)], -1)
+     if bsz is not None:
+         return tf.tile(pos_emb[:, None, :], [1, bsz, 1])
+     else:
+         return pos_emb[:, None, :]
+
+
+ def positionwise_FF(inp, d_model, d_inner, dropout, kernel_initializer,
+                     scope='ff', is_training=True):
+     output = inp
+     with tf.compat.v1.variable_scope(scope):
+         output = tf.keras.layers.Dense(d_inner, activation=tf.nn.relu,
+                                        kernel_initializer=kernel_initializer, name='layer_1')(inp)
+         output = tf.keras.layers.Dropout(dropout, name='drop_1')(output, training=is_training)
+         output = tf.keras.layers.Dense(d_model, activation=tf.nn.relu,
+                                        kernel_initializer=kernel_initializer, name='layer_2')(output)
+         output = tf.keras.layers.Dropout(dropout, name='drop_2')(output, training=is_training)
+         output = tf.keras.layers.LayerNormalization(axis=-1)(output + inp)
+     return output
+
+
+ def _create_mask(qlen, mlen, same_length=False):
+     attn_mask = tf.ones([qlen, qlen])
+     mask_u = tf.linalg.band_part(attn_mask, 0, -1)
+     mask_dia = tf.linalg.band_part(attn_mask, 0, 0)
+     attn_mask_pad = tf.zeros([qlen, mlen])
+     ret = tf.concat([attn_mask_pad, mask_u - mask_dia], 1)
+     if same_length:
+         # tf.matrix_band_part is the deprecated TF1 alias; use tf.linalg.band_part as above
+         mask_l = tf.linalg.band_part(attn_mask, -1, 0)
+         ret = tf.concat([ret[:, :qlen] + mask_l - mask_dia, ret[:, qlen:]], 1)
+     return ret
+
+
+ def _cache_mem(curr_out, prev_mem, mem_len=None):
+     if mem_len is None or prev_mem is None:
+         new_mem = curr_out
+     elif mem_len == 0:
+         return prev_mem
+     else:
+         new_mem = tf.concat([prev_mem, curr_out], 0)[-mem_len:]
+     return tf.stop_gradient(new_mem)
+
+
+ def rel_shift(x):
+     x_size = tf.shape(x)
+     x = tf.pad(x, [[0, 0], [1, 0], [0, 0], [0, 0]])
+     x = tf.reshape(x, [x_size[1] + 1, x_size[0], x_size[2], x_size[3]])
+     x = tf.slice(x, [1, 0, 0, 0], [-1, -1, -1, -1])
+     x = tf.reshape(x, x_size)
+     return x
+
+
+ def rel_multihead_attn(w, r, r_w_bias, r_r_bias, attn_mask, mems, d_model,
+                        n_head, d_head, dropout, dropatt, is_training,
+                        kernel_initializer, scope='rel_attn'):
+     scale = 1 / (d_head ** 0.5)
+     with tf.compat.v1.variable_scope(scope):
+         qlen = tf.shape(w)[0]
+         rlen = tf.shape(r)[0]
+         bsz = tf.shape(w)[1]
+
+         cat = tf.concat([mems, w], 0) if mems is not None and mems.shape.ndims > 1 else w
+
+         w_heads = tf.keras.layers.Dense(3 * n_head * d_head, use_bias=False,
+                                         kernel_initializer=kernel_initializer, name='qkv')(cat)
+         r_head_k = tf.keras.layers.Dense(n_head * d_head, use_bias=False,
+                                          kernel_initializer=kernel_initializer, name='r')(r)
+
+         w_head_q, w_head_k, w_head_v = tf.split(w_heads, 3, -1)
+         w_head_q = w_head_q[-qlen:]
+
+         klen = tf.shape(w_head_k)[0]
+
+         w_head_q = tf.reshape(w_head_q, [qlen, bsz, n_head, d_head])
+         w_head_k = tf.reshape(w_head_k, [klen, bsz, n_head, d_head])
+         w_head_v = tf.reshape(w_head_v, [klen, bsz, n_head, d_head])
+
+         r_head_k = tf.reshape(r_head_k, [rlen, n_head, d_head])
+
+         rw_head_q = w_head_q + r_w_bias
+         rr_head_q = w_head_q + r_r_bias
+
+         AC = tf.einsum('ibnd,jbnd->ijbn', rw_head_q, w_head_k)
+         BD = tf.einsum('ibnd,jnd->ijbn', rr_head_q, r_head_k)
+         BD = rel_shift(BD)
+
+         attn_score = (AC + BD) * scale
+         attn_mask_t = attn_mask[:, :, None, None]
+         attn_score = attn_score * (1 - attn_mask_t) - 1e30 * attn_mask_t
+
+         attn_prob = tf.nn.softmax(attn_score, 1)
+         attn_prob = tf.keras.layers.Dropout(dropatt)(attn_prob, training=is_training)
+
+         attn_vec = tf.einsum('ijbn,jbnd->ibnd', attn_prob, w_head_v)
+         size_t = tf.shape(attn_vec)
+         attn_vec = tf.reshape(attn_vec, [size_t[0], size_t[1], n_head * d_head])
+
+         attn_out = tf.keras.layers.Dense(d_model, use_bias=False,
+                                          kernel_initializer=kernel_initializer, name='o')(attn_vec)
+         attn_out = tf.keras.layers.Dropout(dropout)(attn_out, training=is_training)
+         output = tf.keras.layers.LayerNormalization(axis=-1)(attn_out + w)
+     return output
+
+
+ def transformer(dec_inp, target, mems, n_token, n_layer, d_model, d_embed,
+                 n_head, d_head, d_inner, dropout, dropatt,
+                 initializer, is_training, proj_initializer=None,
+                 mem_len=None, cutoffs=[], div_val=1, tie_projs=[],
+                 same_length=False, clamp_len=-1,
+                 input_perms=None, target_perms=None, head_target=None,
+                 untie_r=False, proj_same_dim=True,
+                 scope='transformer'):
+     """
+     cutoffs: a list of Python ints. Cutoffs for adaptive softmax.
+     tie_projs: a list of Python bools. Whether to tie the projections.
+     perms: a list of tensors. Each tensor should be of size [len, bsz, bin_size].
+         Only used in the adaptive setting.
+     """
+     new_mems = []
+     with tf.compat.v1.variable_scope(scope):
+         if untie_r:
+             r_w_bias = tf.compat.v1.get_variable('r_w_bias', [n_layer, n_head, d_head], initializer=initializer)
+             r_r_bias = tf.compat.v1.get_variable('r_r_bias', [n_layer, n_head, d_head], initializer=initializer)
+         else:
+             r_w_bias = tf.compat.v1.get_variable('r_w_bias', [n_head, d_head], initializer=initializer)
+             r_r_bias = tf.compat.v1.get_variable('r_r_bias', [n_head, d_head], initializer=initializer)
+
+         qlen = tf.shape(dec_inp)[0]
+         mlen = tf.shape(mems[0])[0] if mems is not None else 0
+         klen = qlen + mlen
+
+         if proj_initializer is None:
+             proj_initializer = initializer
+
+         embeddings, shared_params = normal_embedding_lookup(
+             x=dec_inp,
+             n_token=n_token,
+             d_embed=d_embed,
+             d_proj=d_model,
+             initializer=initializer,
+             proj_initializer=proj_initializer)
+
+         attn_mask = _create_mask(qlen, mlen, same_length)
+
+         pos_seq = tf.range(klen - 1, -1, -1.0)
+         if clamp_len > 0:
+             pos_seq = tf.minimum(pos_seq, clamp_len)
+         inv_freq = 1 / (10000 ** (tf.range(0, d_model, 2.0) / d_model))
+         pos_emb = positional_embedding(pos_seq, inv_freq)
+
+         output = tf.keras.layers.Dropout(rate=dropout)(embeddings, training=is_training)
+         pos_emb = tf.keras.layers.Dropout(rate=dropout)(pos_emb, training=is_training)
+
+         if mems is None:
+             mems = [None] * n_layer
+
+         for i in range(n_layer):
+             # cache new mems
+             new_mems.append(_cache_mem(output, mems[i], mem_len))
+
+             with tf.compat.v1.variable_scope('layer_{}'.format(i)):
+                 output = rel_multihead_attn(
+                     w=output,
+                     r=pos_emb,
+                     r_w_bias=r_w_bias if not untie_r else r_w_bias[i],
+                     r_r_bias=r_r_bias if not untie_r else r_r_bias[i],
+                     attn_mask=attn_mask,
+                     mems=mems[i],
+                     d_model=d_model,
+                     n_head=n_head,
+                     d_head=d_head,
+                     dropout=dropout,
+                     dropatt=dropatt,
+                     is_training=is_training,
+                     kernel_initializer=initializer)
+
+                 output = positionwise_FF(
+                     inp=output,
+                     d_model=d_model,
+                     d_inner=d_inner,
+                     dropout=dropout,
+                     kernel_initializer=initializer,
+                     is_training=is_training)
+
+         output = tf.keras.layers.Dropout(dropout)(output, training=is_training)
+
+         loss, logits = normal_softmax(
+             hidden=output,
+             target=target,
+             n_token=n_token,
+             params=shared_params)
+
+         return loss, logits, new_mems
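To make the shapes in `_create_mask` and `_cache_mem` easier to follow, here is a small NumPy sketch of the same logic. It is an illustration under assumptions, not the module's code: it covers only the `same_length=False` case, omits `tf.stop_gradient`, and the function names `create_mask` and `cache_mem` are hypothetical.

```python
import numpy as np


def create_mask(qlen, mlen):
    # 1 marks a key position the query must NOT attend to.
    # Strictly upper triangle = future tokens (mask_u - mask_dia in the TF code).
    future = np.triu(np.ones((qlen, qlen)), k=1)
    # Memory columns come first and are always visible (all zeros).
    return np.concatenate([np.zeros((qlen, mlen)), future], axis=1)


def cache_mem(curr_out, prev_mem, mem_len=None):
    # Keep only the most recent `mem_len` time steps along axis 0.
    if mem_len is None or prev_mem is None:
        return curr_out
    if mem_len == 0:
        return prev_mem
    return np.concatenate([prev_mem, curr_out], axis=0)[-mem_len:]


if __name__ == '__main__':
    print(create_mask(3, 2))
    prev = np.arange(4).reshape(4, 1)
    curr = np.arange(3).reshape(3, 1) + 10
    print(cache_mem(curr, prev, mem_len=4).ravel())
```

With `qlen=3` and `mlen=2`, each query row sees both memory slots plus itself and earlier tokens, and the cache slides forward so the oldest memory steps are dropped first.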
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ miditoolkit
+ tensorflow-gpu==1.14.0
+ gradio
+ torchtext
result/continuation.midi ADDED
Binary file (1.62 kB).
result/from_scratch.midi ADDED
Binary file (1.73 kB).
utils.py ADDED
@@ -0,0 +1,348 @@
+ import chord_recognition
+ import numpy as np
+ import miditoolkit
+ import copy
+
+ # parameters for input
+ # (np.int was removed in NumPy 1.24; the builtin int is equivalent here)
+ DEFAULT_VELOCITY_BINS = np.linspace(0, 128, 32+1, dtype=int)
+ DEFAULT_FRACTION = 16
+ DEFAULT_DURATION_BINS = np.arange(60, 3841, 60, dtype=int)
+ DEFAULT_TEMPO_INTERVALS = [range(30, 90), range(90, 150), range(150, 210)]
+
+ # parameters for output
+ DEFAULT_RESOLUTION = 480
+
+ # define "Item" for general storage
+ class Item(object):
+     def __init__(self, name, start, end, velocity, pitch):
+         self.name = name
+         self.start = start
+         self.end = end
+         self.velocity = velocity
+         self.pitch = pitch
+
+     def __repr__(self):
+         return 'Item(name={}, start={}, end={}, velocity={}, pitch={})'.format(
+             self.name, self.start, self.end, self.velocity, self.pitch)
+
+ # read notes and tempo changes from midi (assume there is only one track)
+ def read_items(file_path):
+     midi_obj = miditoolkit.midi.parser.MidiFile(file_path)
+     # note
+     note_items = []
+     notes = midi_obj.instruments[0].notes
+     notes.sort(key=lambda x: (x.start, x.pitch))
+     for note in notes:
+         note_items.append(Item(
+             name='Note',
+             start=note.start,
+             end=note.end,
+             velocity=note.velocity,
+             pitch=note.pitch))
+     note_items.sort(key=lambda x: x.start)
+     # tempo
+     tempo_items = []
+     for tempo in midi_obj.tempo_changes:
+         tempo_items.append(Item(
+             name='Tempo',
+             start=tempo.time,
+             end=None,
+             velocity=None,
+             pitch=int(tempo.tempo)))
+     tempo_items.sort(key=lambda x: x.start)
+     # expand to all beat
+     max_tick = tempo_items[-1].start
+     existing_ticks = {item.start: item.pitch for item in tempo_items}
+     wanted_ticks = np.arange(0, max_tick+1, DEFAULT_RESOLUTION)
+     output = []
+     for tick in wanted_ticks:
+         if tick in existing_ticks:
+             output.append(Item(
+                 name='Tempo',
+                 start=tick,
+                 end=None,
+                 velocity=None,
+                 pitch=existing_ticks[tick]))
+         else:
+             output.append(Item(
+                 name='Tempo',
+                 start=tick,
+                 end=None,
+                 velocity=None,
+                 pitch=output[-1].pitch))
+     tempo_items = output
+     return note_items, tempo_items
+
+ # quantize items
+ def quantize_items(items, ticks=120):
+     # grid
+     grids = np.arange(0, items[-1].start, ticks, dtype=int)
+     # process
+     for item in items:
+         index = np.argmin(abs(grids - item.start))
+         shift = grids[index] - item.start
+         item.start += shift
+         item.end += shift
+     return items
+
+ # extract chord
+ def extract_chords(items):
+     method = chord_recognition.MIDIChord()
+     chords = method.extract(notes=items)
+     output = []
+     for chord in chords:
+         output.append(Item(
+             name='Chord',
+             start=chord[0],
+             end=chord[1],
+             velocity=None,
+             pitch=chord[2].split('/')[0]))
+     return output
+
+ # group items
+ def group_items(items, max_time, ticks_per_bar=DEFAULT_RESOLUTION*4):
+     items.sort(key=lambda x: x.start)
+     downbeats = np.arange(0, max_time+ticks_per_bar, ticks_per_bar)
+     groups = []
+     for db1, db2 in zip(downbeats[:-1], downbeats[1:]):
+         insiders = []
+         for item in items:
+             if (item.start >= db1) and (item.start < db2):
+                 insiders.append(item)
+         overall = [db1] + insiders + [db2]
+         groups.append(overall)
+     return groups
+
+ # define "Event" for event storage
+ class Event(object):
+     def __init__(self, name, time, value, text):
+         self.name = name
+         self.time = time
+         self.value = value
+         self.text = text
+
+     def __repr__(self):
+         return 'Event(name={}, time={}, value={}, text={})'.format(
+             self.name, self.time, self.value, self.text)
+
+ # item to event
+ def item2event(groups):
+     events = []
+     n_downbeat = 0
+     for i in range(len(groups)):
+         if 'Note' not in [item.name for item in groups[i][1:-1]]:
+             continue
+         bar_st, bar_et = groups[i][0], groups[i][-1]
+         n_downbeat += 1
+         events.append(Event(
+             name='Bar',
+             time=None,
+             value=None,
+             text='{}'.format(n_downbeat)))
+         for item in groups[i][1:-1]:
+             # position
+             flags = np.linspace(bar_st, bar_et, DEFAULT_FRACTION, endpoint=False)
+             index = np.argmin(abs(flags-item.start))
+             events.append(Event(
+                 name='Position',
+                 time=item.start,
+                 value='{}/{}'.format(index+1, DEFAULT_FRACTION),
+                 text='{}'.format(item.start)))
+             if item.name == 'Note':
+                 # velocity
+                 velocity_index = np.searchsorted(
+                     DEFAULT_VELOCITY_BINS,
+                     item.velocity,
+                     side='right') - 1
+                 events.append(Event(
+                     name='Note Velocity',
+                     time=item.start,
+                     value=velocity_index,
+                     text='{}/{}'.format(item.velocity, DEFAULT_VELOCITY_BINS[velocity_index])))
+                 # pitch
+                 events.append(Event(
+                     name='Note On',
+                     time=item.start,
+                     value=item.pitch,
+                     text='{}'.format(item.pitch)))
+                 # duration
+                 duration = item.end - item.start
+                 index = np.argmin(abs(DEFAULT_DURATION_BINS-duration))
+                 events.append(Event(
+                     name='Note Duration',
+                     time=item.start,
+                     value=index,
+                     text='{}/{}'.format(duration, DEFAULT_DURATION_BINS[index])))
+             elif item.name == 'Chord':
+                 events.append(Event(
+                     name='Chord',
+                     time=item.start,
+                     value=item.pitch,
+                     text='{}'.format(item.pitch)))
+             elif item.name == 'Tempo':
+                 tempo = item.pitch
+                 if tempo in DEFAULT_TEMPO_INTERVALS[0]:
+                     tempo_style = Event('Tempo Class', item.start, 'slow', None)
+                     tempo_value = Event('Tempo Value', item.start,
+                         tempo-DEFAULT_TEMPO_INTERVALS[0].start, None)
+                 elif tempo in DEFAULT_TEMPO_INTERVALS[1]:
+                     tempo_style = Event('Tempo Class', item.start, 'mid', None)
+                     tempo_value = Event('Tempo Value', item.start,
+                         tempo-DEFAULT_TEMPO_INTERVALS[1].start, None)
+                 elif tempo in DEFAULT_TEMPO_INTERVALS[2]:
+                     tempo_style = Event('Tempo Class', item.start, 'fast', None)
+                     tempo_value = Event('Tempo Value', item.start,
+                         tempo-DEFAULT_TEMPO_INTERVALS[2].start, None)
+                 elif tempo < DEFAULT_TEMPO_INTERVALS[0].start:
+                     tempo_style = Event('Tempo Class', item.start, 'slow', None)
+                     tempo_value = Event('Tempo Value', item.start, 0, None)
+                 # range.stop is exclusive, so use >= to also cover tempo == 210
+                 elif tempo >= DEFAULT_TEMPO_INTERVALS[2].stop:
+                     tempo_style = Event('Tempo Class', item.start, 'fast', None)
+                     tempo_value = Event('Tempo Value', item.start, 59, None)
+                 events.append(tempo_style)
+                 events.append(tempo_value)
+     return events
+
+ #############################################################################################
+ # WRITE MIDI
+ #############################################################################################
+ def word_to_event(words, word2event):
+     events = []
+     for word in words:
+         event_name, event_value = word2event.get(word).split('_')
+         events.append(Event(event_name, None, event_value, None))
+     return events
+
+ def write_midi(words, word2event, output_path, prompt_path=None):
+     events = word_to_event(words, word2event)
+     # get downbeat and note (no time)
+     temp_notes = []
+     temp_chords = []
+     temp_tempos = []
+     for i in range(len(events)-3):
+         if events[i].name == 'Bar' and i > 0:
+             temp_notes.append('Bar')
+             temp_chords.append('Bar')
+             temp_tempos.append('Bar')
+         elif events[i].name == 'Position' and \
+             events[i+1].name == 'Note Velocity' and \
+             events[i+2].name == 'Note On' and \
+             events[i+3].name == 'Note Duration':
+             # start time and end time from position
+             position = int(events[i].value.split('/')[0]) - 1
+             # velocity
+             index = int(events[i+1].value)
+             velocity = int(DEFAULT_VELOCITY_BINS[index])
+             # pitch
+             pitch = int(events[i+2].value)
+             # duration
+             index = int(events[i+3].value)
+             duration = DEFAULT_DURATION_BINS[index]
+             # adding
+             temp_notes.append([position, velocity, pitch, duration])
+         elif events[i].name == 'Position' and events[i+1].name == 'Chord':
+             position = int(events[i].value.split('/')[0]) - 1
+             temp_chords.append([position, events[i+1].value])
+         elif events[i].name == 'Position' and \
+             events[i+1].name == 'Tempo Class' and \
+             events[i+2].name == 'Tempo Value':
+             position = int(events[i].value.split('/')[0]) - 1
+             if events[i+1].value == 'slow':
+                 tempo = DEFAULT_TEMPO_INTERVALS[0].start + int(events[i+2].value)
+             elif events[i+1].value == 'mid':
+                 tempo = DEFAULT_TEMPO_INTERVALS[1].start + int(events[i+2].value)
+             elif events[i+1].value == 'fast':
+                 tempo = DEFAULT_TEMPO_INTERVALS[2].start + int(events[i+2].value)
+             temp_tempos.append([position, tempo])
+     # get specific time for notes
+     ticks_per_beat = DEFAULT_RESOLUTION
+     ticks_per_bar = DEFAULT_RESOLUTION * 4 # assume 4/4
+     notes = []
+     current_bar = 0
+     for note in temp_notes:
+         if note == 'Bar':
+             current_bar += 1
+         else:
+             position, velocity, pitch, duration = note
+             # position (start time)
+             current_bar_st = current_bar * ticks_per_bar
+             current_bar_et = (current_bar + 1) * ticks_per_bar
+             flags = np.linspace(current_bar_st, current_bar_et, DEFAULT_FRACTION, endpoint=False, dtype=int)
+             st = flags[position]
+             # duration (end time)
+             et = st + duration
+             notes.append(miditoolkit.Note(velocity, pitch, st, et))
+     # get specific time for chords
+     if len(temp_chords) > 0:
+         chords = []
+         current_bar = 0
+         for chord in temp_chords:
+             if chord == 'Bar':
+                 current_bar += 1
+             else:
+                 position, value = chord
+                 # position (start time)
+                 current_bar_st = current_bar * ticks_per_bar
+                 current_bar_et = (current_bar + 1) * ticks_per_bar
+                 flags = np.linspace(current_bar_st, current_bar_et, DEFAULT_FRACTION, endpoint=False, dtype=int)
+                 st = flags[position]
+                 chords.append([st, value])
+     # get specific time for tempos
+     tempos = []
+     current_bar = 0
+     for tempo in temp_tempos:
+         if tempo == 'Bar':
+             current_bar += 1
+         else:
+             position, value = tempo
+             # position (start time)
+             current_bar_st = current_bar * ticks_per_bar
+             current_bar_et = (current_bar + 1) * ticks_per_bar
+             flags = np.linspace(current_bar_st, current_bar_et, DEFAULT_FRACTION, endpoint=False, dtype=int)
+             st = flags[position]
+             tempos.append([int(st), value])
+     # write
+     if prompt_path:
+         midi = miditoolkit.midi.parser.MidiFile(prompt_path)
+         #
+         last_time = DEFAULT_RESOLUTION * 4 * 4
+         # note shift
+         for note in notes:
+             note.start += last_time
+             note.end += last_time
+         midi.instruments[0].notes.extend(notes)
+         # tempo changes
+         temp_tempos = []
+         for tempo in midi.tempo_changes:
+             if tempo.time < DEFAULT_RESOLUTION*4*4:
+                 temp_tempos.append(tempo)
+             else:
+                 break
+         for st, bpm in tempos:
+             st += last_time
+             temp_tempos.append(miditoolkit.midi.containers.TempoChange(bpm, st))
+         midi.tempo_changes = temp_tempos
+         # write chord into marker
+         if len(temp_chords) > 0:
+             for c in chords:
+                 midi.markers.append(
+                     miditoolkit.midi.containers.Marker(text=c[1], time=c[0]+last_time))
+     else:
+         midi = miditoolkit.midi.parser.MidiFile()
+         midi.ticks_per_beat = DEFAULT_RESOLUTION
+         # write instrument
+         inst = miditoolkit.midi.containers.Instrument(0, is_drum=False)
+         inst.notes = notes
+         midi.instruments.append(inst)
+         # write tempo
+         tempo_changes = []
+         for st, bpm in tempos:
+             tempo_changes.append(miditoolkit.midi.containers.TempoChange(bpm, st))
+         midi.tempo_changes = tempo_changes
+         # write chord into marker
+         if len(temp_chords) > 0:
+             for c in chords:
+                 midi.markers.append(
+                     miditoolkit.midi.containers.Marker(text=c[1], time=c[0]))
+     # write
+     midi.dump(output_path)
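The `Position` and `Note Velocity` events are lossy: `item2event` snaps a tick to one of 16 slots per bar and a velocity to one of 32 bins, and `write_midi` maps the indices back. The sketch below reproduces that round trip with NumPy alone. The constants copy the defaults in utils.py, while the four helper names are hypothetical and the bar is assumed to be 4/4 at 480 ticks per beat.

```python
import numpy as np

VELOCITY_BINS = np.linspace(0, 128, 32 + 1, dtype=int)  # 0, 4, 8, ..., 128
FRACTION = 16                                            # positions per bar
TICKS_PER_BAR = 480 * 4                                  # assumes 4/4


def encode_position(start, bar_start):
    # Nearest of the 16 grid points in the bar, reported 1-based as in 'k/16'.
    flags = np.linspace(bar_start, bar_start + TICKS_PER_BAR, FRACTION, endpoint=False)
    return int(np.argmin(abs(flags - start))) + 1


def decode_position(position, bar_index):
    # Inverse mapping used when writing MIDI: 1-based slot -> absolute tick.
    bar_start = bar_index * TICKS_PER_BAR
    flags = np.linspace(bar_start, bar_start + TICKS_PER_BAR, FRACTION,
                        endpoint=False, dtype=int)
    return int(flags[position - 1])


def encode_velocity(velocity):
    # Index of the last bin edge that is <= velocity.
    return int(np.searchsorted(VELOCITY_BINS, velocity, side='right') - 1)


def decode_velocity(index):
    return int(VELOCITY_BINS[index])


if __name__ == '__main__':
    slot = encode_position(250, bar_start=0)
    print(slot, decode_position(slot, bar_index=1))
    vel = encode_velocity(66)
    print(vel, decode_velocity(vel))
```

A note that starts at tick 250 lands on slot 3 and is written back at tick 240 of its bar, and a velocity of 66 comes back as 64, so the grid spacing (120 ticks, 4 velocity units) bounds the reconstruction error.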