Output more than one instrument using music21 and Python

I am using Python and music21 to code an algorithm that composes melodies from input music files of violin pieces with piano accompaniment. My problem is that when I input a MIDI file that has two instruments, the output is only in one instrument. I can currently change the output instrument to a guitar, trumpet, etc., even though those instruments are not present in my original input files. I would like to know whether I could write some code that identifies the instruments in the input files and outputs those specific instruments. Alternatively, is there any way that I could code for two output instruments rather than one? I have tried to copy the existing code with another instrument, but the algorithm only outputs the last instrument detected in the code. Below is my current running code:
from music21 import instrument, note, chord

def convert_to_midi(prediction_output):
    offset = 0
    output_notes = []
    # Create note and chord objects based on the values generated by the model
    for pattern in prediction_output:
        # Pattern is a chord
        if ('.' in pattern) or pattern.isdigit():
            notes_in_chord = pattern.split('.')
            notes = []
            for current_note in notes_in_chord:
                output_notes.append(instrument.Guitar())
                cn = int(current_note)
                new_note = note.Note(cn)
                notes.append(new_note)
            new_chord = chord.Chord(notes)
            new_chord.offset = offset
            output_notes.append(new_chord)  # append the chord, not just the last note
        # Pattern is a note
        else:
            output_notes.append(instrument.Guitar())
            new_note = note.Note(pattern)
            new_note.offset = offset
            output_notes.append(new_note)

Instrument objects go directly into the Stream object, not on a Note, and each Part can have only one Instrument object active at a time.
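For example, a minimal sketch along those lines, with placeholder notes rather than the model's output and hard-coded instrument choices:

from music21 import stream, note, chord, instrument

violin_part = stream.Part()
violin_part.insert(0, instrument.Violin())       # one Instrument active per Part
violin_part.append(note.Note('E5', quarterLength=1.0))
violin_part.append(note.Note('G5', quarterLength=1.0))

piano_part = stream.Part()
piano_part.insert(0, instrument.Piano())
piano_part.append(chord.Chord(['C3', 'E3', 'G3'], quarterLength=2.0))

score = stream.Score([violin_part, piano_part])  # both parts end up in one MIDI file
score.write('midi', fp='two_instruments.mid')

To mirror the instruments that are actually present in your input, you could collect the Instrument objects from the parsed input score (for example with instrument.partitionByInstrument, or by recursing over the stream and gathering anything whose classes include 'Instrument') and insert those into the parts instead of the hard-coded Violin and Piano.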

How to get the list of matched feature names along with predict_prob in CalibratedClassifierCV?

I am trying to find the profanity score of text received in chats.
For this I went through a couple of Python libraries and found some relevant ones:
profanity-check
alt-profanity-check -- (currently using)
profanity-filter
detoxify
The one I am using (profanity-check) is giving me proper results when using predict and predict_prob against the calibrated classifier used under the hood after training.
The problem is that I am unable to identify the words that were used to make the prediction or calculate the probability; in short, the list of feature names (profane words) present in the test data passed as input.
I know there are no methods that return this, but I would like to fork the library and add one.
I wanted to understand whether we can add something to this place (edit) to create such a method.
For example:
text = ["this is crap"]
predict(text) - array([1])
predict_prob(text) - array([0.99868968])
predict_words(text) - array(["crap"]) ---- (NEED THIS)
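One way to approach a predict_words-style method: the pipeline underneath is a vectorizer feeding a linear classifier, so you can intersect the non-zero features of the input with the features whose learned weight points towards the profane class. A self-contained sketch, not profanity-check's internal code; the toy data, variable names and the predict_words helper are all assumptions:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV

# Toy stand-ins for the library's training data and fitted objects.
texts = ["this is crap", "what the hell", "utter crap again",
         "have a nice day", "good morning", "lovely weather"]
labels = [1, 1, 1, 0, 0, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

svc = LinearSVC().fit(X, labels)                                # exposes per-feature coef_
clf = CalibratedClassifierCV(LinearSVC(), cv=3).fit(X, labels)  # gives predict_proba

def predict_words(text):
    # Tokens of `text` whose learned weight pushes the score towards class 1.
    row = vectorizer.transform([text])
    names = vectorizer.get_feature_names_out()
    weights = svc.coef_[0]
    present = row.nonzero()[1]          # feature indices that occur in the text
    return [names[i] for i in present if weights[i] > 0]

print(clf.predict_proba(vectorizer.transform(["this is crap"]))[:, 1])
print(predict_words("this is crap"))    # tokens that push the score towards "profane"

In a fork of the library, the same idea could sit next to predict and predict_prob, reusing its already-fitted vectorizer and classifier instead of the toy ones above.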

Replace a value on one data set with a value from another data set with a dynamic lookup

This question relates primarily to Alteryx; however, if it can be done in Python, or in R within an Alteryx workflow using the R tool, that would work as well.
I have two data sets.
Address (contains address information: Line1, Line2, City, State, Zip)
USPS (contains USPS abbreviations: Street to ST, Boulevard to BLVD, etc.)
Goal: Look at the Line1 string in the Address data set. If it contains one of the street types from the USPS data set, I want to replace that part of the string with its proper abbreviation, which is in a different column of the USPS data set.
For example, 123 Main Street would become 123 Main St.
What I have tried:
Imported the two data sets.
Union the two data sets with the instruction of Output All Fields for When Fields Differ.
Added a formula, but this is where I am getting stuck. So far it reads:
if [Addr1] Contains(String, Target)
I am not sure how to have it look in the USPS data set for one of the values, and I am also not certain whether this sort of dynamic lookup is possible.
If this can be done in Python (I know only very basic Python, so I do not have code for this yet beyond importing the data), I can use Python within Alteryx.
Any assistance would be great. Please let me know if you need additional information.
Thank you in advance.
Use the Find Replace tool in Alteryx. This tool is akin to a lookup. Furthermore, use the Alteryx Community as a go-to resource for these types of questions.
Input the Address dataset into the top anchor of the Find Replace tool and the USPS dataset into the bottom anchor. You'll want to find any part of the address field using the lookup field and replace it with the abbreviation field. If you need to do this across several fields in the Address dataset, then you could replicate this logic or you could use a Record ID tool, Transpose, run this logic on one field, and then Cross Tab back to the original schema. It's an advanced recipe that you'll want to master in Alteryx.
https://help.alteryx.com/current/FindReplace.htm
The overall logic that can be used is here: Using str_detect (or some other function) and some way to loop through a list to essentially perform a vlookup
However, in order to expand to Alteryx, you would need to add the Alteryx R tool. Also, some of the code would need to be changed to use the syntax that Alteryx likes.
Read in the data with:
read.Alteryx('#Link Number', mode = 'data.frame')
After that, the question linked above provides the overall framework for the logic, reiterated here:
library(stringr)  # for str_replace_all

usps[] = lapply(usps, as.character)

## Copies the original address data to a new column that will
## be altered. Preserves the original formatting for rollback
## if necessary
vendorData$new_addr1 = as.character(vendorData$Addr1)

## Loops through the dictionary, replacing all of the common names
## with their USPS-approved abbreviations for the Addr1 field.
for (i in 1:nrow(usps)) {
  vendorData$new_addr1 = str_replace_all(
    vendorData$new_addr1,
    pattern = paste0("\\b", usps$Abbreviation[i], "\\b"),
    replacement = usps$USPS_Abbrv_updated[i]
  )
}
Finally, in order to be able to see the output, we would need to write a statement that will output it in one of the 5 output slots the R tool has. Here is the code for that:
write.Alteryx(data, #)
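If you would rather do this in Python (which you mentioned is an option), the same loop-over-the-lookup idea translates to pandas. A rough sketch, where the column names and the tiny inline data frames are assumptions standing in for your Address and USPS inputs:

import re
import pandas as pd

# Stand-ins for the two inputs.
address = pd.DataFrame({"Line1": ["123 Main Street", "45 Ocean Boulevard"]})
usps = pd.DataFrame({"FullName": ["Street", "Boulevard"],
                     "Abbreviation": ["ST", "BLVD"]})

# Whole-word replacement of each street type with its abbreviation.
address["Line1_new"] = address["Line1"]
for full, abbr in zip(usps["FullName"], usps["Abbreviation"]):
    pattern = r"\b" + re.escape(full) + r"\b"
    address["Line1_new"] = address["Line1_new"].str.replace(pattern, abbr, regex=True)

print(address)

Inside Alteryx, you would read the two data sets from the Python tool's input anchors and write the result to an output anchor instead of using the inline data frames.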

How do I read a midi file, change its instrument, and write it back?

I want to parse an already existing .mid file, change its instrument, from 'acoustic grand piano' to 'violin' for example, and save it back or as another .mid file.
From what I saw in the documentation, the instrument gets altered with a program_change or patch_change directive, but I cannot find any library that does this for MIDI files that already exist. They all seem to support it only for MIDI files created from scratch.
The MIDI package will do this for you, but the exact approach depends on the original contents of the midi file.
A midi file consists of one or more tracks, and each track is a sequence of events on any of sixteen channels, such as Note Off, Note On, Program Change etc. The last of these will change the instrument assigned to a channel, and that is what you need to change or add.
Without any Program Change events at all, a channel will use program number (voice number) zero, which is an acoustic grand piano. If you want to change the instrument for such a channel then all you need to do is add a new Program Change event for this channel at the beginning of the track.
However if a channel already has a Program Change event then adding a new one at the beginning will have no effect because it is immediately overridden by the pre-existing one. In this case you will have to change the parameters of the existing event to use the instrument that you want.
Things could be even more complicated if there are originally several Program Change events for a channel, meaning that the instrument changes throughout the track. This is unusual, but if you come across a file like this you will have to decide how you want to change it.
Supposing you have a very simple midi file with a single track, one channel, and no existing Program Change events. This program creates a new MIDI::Opus object from the file, accesses the list of tracks (with only a single member), and takes a reference to the list of the first track's events. Then a new Program Change event (this module calls it patch_change) for channel 0 is unshifted onto the beginning of the event list. The new event has a program number of 40 - violin - so this channel will now be played with a violin instead of a piano.
With multiple tracks, multiple channels, and existing Program Change events the task becomes more complex, but the principle is the same - decide what needs to be done and alter the list of events as necessary.
use strict;
use warnings;
use MIDI;
my $opus = MIDI::Opus->new( { from_file => 'song.mid' } );
my $tracks = $opus->tracks_r;
my $track0_events = $tracks->[0]->events_r;
unshift @$track0_events, ['patch_change', 0, 0, 40];
$opus->write_to_file('newsong.mid');
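The same event-level approach works in Python with the mido library (not used in the answers here; the file names and the violin program number 40 follow the example above):

import mido

mid = mido.MidiFile('song.mid')
track = mid.tracks[0]

# Rewrite any existing Program Change for channel 0; otherwise prepend one.
has_program_change = False
for msg in track:
    if msg.type == 'program_change' and msg.channel == 0:
        msg.program = 40            # 40 = violin in General MIDI (0-based)
        has_program_change = True
if not has_program_change:
    track.insert(0, mido.Message('program_change', channel=0, program=40, time=0))

mid.save('newsong.mid')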
Use the music21 library (plugging my own system, hope that's okay). If there are patches defined in the parts, do:
from music21 import converter,instrument # or import *
s = converter.parse('/Users/cuthbert/Desktop/oldfilename.mid')
for el in s.recurse():
    if 'Instrument' in el.classes:  # or 'Piano'
        el.activeSite.replace(el, instrument.Violin())
s.write('midi', '/Users/cuthbert/Desktop/newfilename.mid')
or if there are no patch changes currently defined:
from music21 import converter,instrument # or import *
s = converter.parse('/Users/cuthbert/Desktop/oldfilename.mid')
for p in s.parts:
    p.insert(0, instrument.Violin())
s.write('midi', '/Users/cuthbert/Desktop/newfilename.mid')

How to iterate over all the objects in a PDF page and check which ones are text objects?

I want to iterate over all the objects in a page of a PDF using pypdf.
I also want to check what the type of each object is, whether it is text or graphics.
A code snippet would be a great help.
Thanks a lot
I think that PyPDF is not the correct tool for the job. What you need is to parse the page itself (for which PyPDF has limited support, see the API documentation), and then to be able to save the results in another PDF object after changing some of the objects.
You might decompress the PDF using pdftk, and this would allow you to use pdfrw.
However, from what you write,
My ultimate goal is to color each text object differently.
a "text object" may be a quite complex object made up of (e.g.) different lines in different paragraphs. This might be, and you might see it as, a single entity. In this entity there might be already be several different text-color commands.
For example you might have a single stream with this sequence of text (this is written in "internal" language):
12.84 0 Td(S)Tj
0.08736 Tc
9 0 Td(e)Tj
0.06816 Tc
0.5 g
7.55999 0 Td(qu)Tj
0.08736 Tc
1 g
16.5599 0 Td(e)Tj
0.06816 Tc
7.55999 0 Td(n)Tj
0.08736 Tc
8.27996 0 Td(c)Tj
-0.03264 Tc
0.13632 Tw
7.55999 0 Td(e )Tj
0.06816 Tc
0 Tw
This might write "Sequence". It's actually made up of seven text subobjects, and there is no library I know of that can "decrypt" the stream into its component subobjects, much less assign them the proper attributes (which in PDF descend from the graphics state, whereas in a hierarchical structure such as XML they would probably be attached to the individual node, maybe through inheritance).
More: the stream might include non-text commands (e.g. lines), so changing the "text" stroking color would actually change the color of non-textual objects as well.
A library would have to give you a level of access comparable to reading the content stream directly, so finding one that does this for you seems unlikely.
Since this is word processing work, you might look into the possibility of converting PDF to OpenOffice (using the PDF Import extension), manipulating it through OOo python, then exporting it back to PDF from within OpenOffice itself.
Beware, however, for there be dragons: the documentation is sketchy, and the interface is sometimes unstable. Accessing "text" might not be practical (the more so, since text will be available to you only on a line by line basis).
Another possibility (again, not for the faint of heart) is to decode the PDF yourself. Start by getting it in uncompressed format through pdftk. This will yield a header followed by a stream of objects in the form
INDEX R obj
<<
COMMANDS OR DATA
>>
[ stream
STREAM OF TEXT
endstream ]
endobj
You can read the stream, and for each object:
1. If COMMANDS OR DATA is only /Length <length>, it is likely a text stream; otherwise go to step 3.
2. Parse the object (see below). If its length changes, remember to update /Length accordingly.
3. Note the current output file offset, save it in XREF[i] ("reference offset for the i-th object"), and write the object to the output file.
At the end of the objects you will find the XREF table, where each object is listed with the file offset at which it resides. These offsets (10-digit numbers) will have to be rewritten according to the new offsets you saved in XREF. The offset of this table goes into the startxref entry at the end of the PDF file.
(To debug, start by writing a routine that copies all objects without modifications. It must recalculate xrefs and offsets, and still yield a PDF object identical to the original).
The PDF thus obtained can be recompressed by pdftk to save space.
As regards the PDF textual object parsing, you basically check it line by line looking for text output commands (See PDF Reference 5.3.2). Mostly the command you'll see will be Tj:
9.95999 0 Td(Hello, world)Tj
and color changing commands (see #4.5.1; again the most used are g and rg.)
1 g # Sets the gray level to 1 (white in colorspace Gray; 0 would be black)
1 0 0 rg # Sets color to red (1,0,0 in colorspace RGB)
You will then keep track of whatever color is currently in use, and might for example bracket each Tj command between a pair of color commands of your choosing - one that sets your text color, one that restores the original. This way you can be sure the graphics state does not "spill" onto any nearby objects, lines, etc.; it will increase the object Length and also make the resulting PDF a little bit slower (but not by much; you might not even notice).
PDF structure is very complex. If you only need the text, another way is to extract the text and then parse it: iterate over each page and call extractText on it.
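A minimal sketch of that extract-then-parse route with the current pypdf API (older PyPDF2 releases spell these PdfFileReader and extractText):

from pypdf import PdfReader

# Pull the raw text of every page; this gets text content only and does not
# distinguish graphics objects.
reader = PdfReader("input.pdf")
for page_number, page in enumerate(reader.pages, start=1):
    text = page.extract_text() or ""
    print(f"--- page {page_number} ---")
    print(text)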

Extracting move information from a pgn file on Python

How do I go about extracting move information from a pgn file on Python? I'm new to programming and any help would be appreciated.
Try pgnparser.
Example code:
import pgn
import sys
f = open(sys.argv[1])
pgn_text = f.read()
f.close()
games = pgn.loads(pgn_text)
for game in games:
    print(game.moves)
@Dennis Golomazov
I like what Dennis did above. To add to it: if you want to extract move information from more than one game in a PGN file (say, a chess database file containing many games), use chess.pgn.
import chess.pgn

pgn_file = open('sample.pgn')
current_game = chess.pgn.read_game(pgn_file)
pgn_text = str(current_game.mainline_moves())
The read_game() method reads games sequentially, so calling it again grabs the next game in the PGN (and returns None once the file is exhausted).
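So a sketch that walks every game in the file:

import chess.pgn

with open('sample.pgn') as pgn_file:
    while True:
        game = chess.pgn.read_game(pgn_file)
        if game is None:          # end of file reached
            break
        print(game.headers.get("White", "?"), "vs", game.headers.get("Black", "?"))
        print(game.mainline_moves())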
I can't give you any Python-specific directions, but I wrote a PGN converter recently in Java, so I'll try to offer some advice. The main disadvantage of Miku's link is that the site doesn't allow for variance in .pgn files, and every site seems to vary slightly in its exact format.
Some .pgn files have the move number attached to the move itself (1.e4 instead of 1. e4), so if you tokenise the string you can check the placement of the dot, since it only occurs in move numbers (see the sketch after these tips).
Work out all the different move combinations you can have. If a move is 5 characters long it could be O-O-O (queenside castling), Nge2+ (knight from the g-file to e2 with check (+) or checkmate (#)), or Rexb5 (rook on the e-file takes b5).
The longest string a move could be is 7 characters (for when you must specify origin rank AND file AND a capture AND with check). The shortest is 2 characters (a pawn advance).
Plan early for castling and en passant moves. You may realise too late that the way you have built your program doesn't easily adapt for them.
The details given at the start(ELO ratings, location, etc.) vary from file to file.
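To make the tokenising idea from these tips concrete, here is a rough Python sketch of a SAN move matcher; the pattern is an assumption that covers the common cases above (castling, checks, captures, promotions), and real PGN files add more edge cases such as comments, annotations and results:

import re

SAN_MOVE = re.compile(r"""
    (O-O-O | O-O                          # queenside / kingside castling
    | [KQRBN][a-h]?[1-8]?x?[a-h][1-8]     # piece move, optional disambiguation and capture
    | [a-h]x[a-h][1-8](=[QRBN])?          # pawn capture, optional promotion
    | [a-h][1-8](=[QRBN])?                # pawn advance, optional promotion
    )[+#]?                                # optional check / checkmate marker
""", re.VERBOSE)

movetext = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. O-O Nge7"
moves = [tok for tok in movetext.split() if SAN_MOVE.fullmatch(tok)]
print(moves)   # ['e4', 'e5', 'Nf3', 'Nc6', 'Bb5', 'a6', 'O-O', 'Nge7']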
I don't have a PGN parser for Python, but you can get the source code of a PGN parser for Xcode from this place; it may be of assistance.
