Replace multiple strings in XML using a key->value pair in a CSV file























I have a dump from our application server that contains XML with many string values. I am interested in the userID, which is embedded in the XML tags in a format like lasfir1, as in the XML examples below:



<row>
<string></string>
<integer>2177</integer>
<string>assignee =lasfir1 </string>
<string>Firstname Lastname</string>
<integer>10</integer>
<string xsi:nil="true"/>
<integer>450</integer>
</row>

<row>
<string>#ffd600</string>
<integer>2199</integer>
<integer>23</integer>
<integer>474</integer>
<string>assignee</string>
<string>lasfir1</string>
</row>

<row>
<integer>1536</integer>
<string>lasfir1</string>
<integer>235</integer>
<string>USER</string>
</row>

<row>
<string>#ffd610</string>
<integer>2200</integer>
<integer>25</integer>
<integer>464</integer>
<string>assignee</string>
<string>lisfar1</string>
</row>


The requirement is to replace only the userID string (for example "lasfir1") with its equivalent email ID. The email IDs are available in a separate CSV (text) file, which pairs each email ID with its userID:



FirstName.LastName@abc.com,lasfir1
FarstName.ListName@abc.com,lisfar1
LastName.FirstName@abc.com,firlas1


The XML structure may not always be the same, so the search has to be for the userID string itself, not for a pattern of what comes before or after it.



Is there a simple way to read the key->value pairs from the CSV file, check whether each key (userID) exists in the XML file, and then replace it with the value (email ID)?



This is required for a set of 300+ userID and email ID combinations, not all of which may appear in the XML.
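
The lookup described above could be sketched in Python as follows. This is only a minimal illustration, assuming the CSV rows are always shaped like `email,userID` as in the sample; the helper names `load_mapping` and `ids_present` are hypothetical, and the sample data is held in memory rather than read from real files.

```python
import csv
import io


def load_mapping(csv_file):
    """Return {userID: email} from rows shaped like 'email,userID'."""
    mapping = {}
    for row in csv.reader(csv_file):
        if len(row) >= 2:  # skip blank or malformed rows
            email, user_id = row[0].strip(), row[1].strip()
            mapping[user_id] = email
    return mapping


def ids_present(mapping, xml_text):
    """Return the userIDs from the mapping that occur somewhere in xml_text."""
    return {user_id for user_id in mapping if user_id in xml_text}


# Demo with the sample data from the question (in memory, not real files).
sample_csv = io.StringIO(
    "FirstName.LastName@abc.com,lasfir1\n"
    "FarstName.ListName@abc.com,lisfar1\n"
    "LastName.FirstName@abc.com,firlas1\n"
)
sample_xml = (
    "<row>\n"
    "<string>assignee =lasfir1 </string>\n"
    "<string>lisfar1</string>\n"
    "</row>\n"
)

mapping = load_mapping(sample_csv)
print(sorted(ids_present(mapping, sample_xml)))  # ['lasfir1', 'lisfar1']
```

Keeping the pairs in a dictionary keyed by userID means each lookup is a single hash access, and userIDs that never occur in the XML simply go unused.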


































  • Are you looking for a language-specific answer? Please state the language and framework you are using for the rest of the application.
    – Rohit Nandi
    Nov 20 at 7:18










  • I am not sure this needs to be language specific, as there might be different solutions in different languages. I have tried sed and Perl, but it is a tedious process: reading each value from the CSV and then searching a 17M-line file once per value is very resource intensive.
    – gagneet
    Nov 20 at 22:09










  • There are three <string> tags, and we cannot tell from them which one holds the ID you are referring to.
    – stack0114106
    Nov 21 at 4:03










  • @stack0114106 The userID field can appear anywhere in the file and, as I mentioned, there is no particular pattern I have been able to search on, other than using the string from the CSV file to find the same string in the XML file.
    – gagneet
    Nov 21 at 4:36
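
The cost mentioned in the comments above, rescanning the large file once per CSV entry, can be avoided by combining all userIDs into one compiled regular expression and scanning each line once. The following is a rough sketch under stated assumptions, not a tested solution for the real dump: the function names are hypothetical, and the whole-token lookarounds are an added assumption so that an ID such as `lasfir1` is not replaced inside a longer token such as `lasfir10`.

```python
import re


def build_pattern(user_ids):
    """Compile one regex that matches any of the given userIDs as whole tokens."""
    # Longest first, so a longer ID wins over an ID that is its prefix.
    alternatives = sorted((re.escape(u) for u in user_ids), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)")


def replace_ids(line, pattern, mapping):
    """Replace every userID found in line with its email, in a single scan."""
    return pattern.sub(lambda m: mapping[m.group(0)], line)


mapping = {
    "lasfir1": "FirstName.LastName@abc.com",
    "lisfar1": "FarstName.ListName@abc.com",
}
pattern = build_pattern(mapping)

print(replace_ids("<string>assignee =lasfir1 </string>", pattern, mapping))
print(replace_ids("<string>lasfir10</string>", pattern, mapping))  # left unchanged
```

With this shape the file is read once regardless of how many pairs the CSV holds, and the callable passed to `pattern.sub` looks up the matched text in the dictionary, so the email is inserted literally.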






















python regex xml perl csv














edited Nov 20 at 22:09

























asked Nov 20 at 5:41









gagneet

























2 Answers













Check out this Perl one-liner solution:



$ cat gagneet.csv
FirstName.LastName@abc.com,lasfir1
FarstName.ListName@abc.com,lisfar1
LastName.FirstName@abc.com,firlas1

$ cat gagneet.xml
<row>
<string></string>
<integer>2177</integer>
<string>assignee =lasfir1 </string>
<string>Firstname Lastname</string>
<integer>10</integer>
<string xsi:nil="true"/>
<integer>450</integer>
</row>

. . . .
. . . .

$ perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat gagneet.csv) ; $content=qx(cat gagneet.xml);while($content=~/(<row>)(.*?)(<\/row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3\n"; } exit } '
<row>
<string></string>
<integer>2177</integer>
<string>assignee =FirstName.LastName@abc.com </string>
<string>Firstname Lastname</string>
<integer>10</integer>
<string xsi:nil="true"/>
<integer>450</integer>
</row>
<row>
<string>#ffd600</string>
<integer>2199</integer>
<integer>23</integer>
<integer>474</integer>
<string>assignee</string>
<string>FirstName.LastName@abc.com</string>
</row>
<row>
<integer>1536</integer>
<string>FirstName.LastName@abc.com</string>
<integer>235</integer>
<string>USER</string>
</row>
<row>
<string>#ffd610</string>
<integer>2200</integer>
<integer>25</integer>
<integer>464</integer>
<string>assignee</string>
<string>FarstName.ListName@abc.com</string>
</row>


If you want to replace a userID only when it is the entire content of a <string> tag, then use:



$ perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat gagneet.csv) ; $content=qx(cat gagneet.xml);while($content=~/(<row>)(.*?)(<\/row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/<string>${y}<\/string>/<string>$kv{$y}<\/string>/gm; } print "$1$xml$3\n"; } exit } '
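
For comparison, the tag-restricted variant can be sketched in Python, the other language tagged on the question. This is an illustrative equivalent under assumptions, not a translation the answer's author provided: the mapping is hard-coded from the sample CSV, and `replace_in_string_tags` is a hypothetical name.

```python
import re

# userID -> email, taken from the sample CSV in the question.
mapping = {
    "lasfir1": "FirstName.LastName@abc.com",
    "lisfar1": "FarstName.ListName@abc.com",
    "firlas1": "LastName.FirstName@abc.com",
}

ids = "|".join(re.escape(u) for u in mapping)
# Group 1: opening tag, group 2: the userID, group 3: closing tag.
tag_pattern = re.compile(r"(<string>)(" + ids + r")(</string>)")


def replace_in_string_tags(text):
    """Replace a userID only when it is the whole content of a <string> element."""
    return tag_pattern.sub(
        lambda m: m.group(1) + mapping[m.group(2)] + m.group(3), text
    )


xml = (
    "<string>assignee =lasfir1 </string>\n"
    "<string>lasfir1</string>\n"
)
print(replace_in_string_tags(xml))
```

Only the second line of the sample is rewritten here, because in the first line the userID is surrounded by other text inside the tag.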




























  • When I run this script, it sometimes gives me an error: "The specified path is invalid." The command is: perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat users.csv) ; $content=qx(cat entities_email.xml);while($content=~/(<row>)(.*?)(<\/row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3\n"; } exit }, and both files are in the folder from which I am running the command. This is on a Windows 10 command line.
    – gagneet
    Nov 27 at 22:14












  • Can you share the sample XML?
    – stack0114106
    Nov 28 at 3:31










  • It is a 1.7 GB file. :-(
    – gagneet
    Nov 30 at 3:45































I created a Python 3 script that takes the CSV file and the XML file as input and writes a new XML file with the changes. The command is:



python xml_converter.py --csvfile file.csv --xmlfile file.xml --outfile output_file.xml


It is not as optimized as I would like and runs on a single thread. It also assumes that the files are UTF-8 encoded.



usage: Replace username to user email of a given xml file
       [-h] --csvfile CSVFILE --xmlfile XMLFILE --outfile OUTFILE

optional arguments:
  -h, --help         show this help message and exit
  --csvfile CSVFILE  csv file that provide user name and email pair
  --xmlfile XMLFILE  xml file that to be searched and replaced
  --outfile OUTFILE  output file name


The basic script is:



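import argparse
import csv


def parse_args():
    # Reconstructed from the usage text above; the original helper was not posted.
    parser = argparse.ArgumentParser(
        prog='Replace username to user email of a given xml file')
    parser.add_argument('--csvfile', required=True,
                        help='csv file that provide user name and email pair')
    parser.add_argument('--xmlfile', required=True,
                        help='xml file that to be searched and replaced')
    parser.add_argument('--outfile', required=True, help='output file name')
    args = parser.parse_args()
    return args.csvfile, args.xmlfile, args.outfile


class XMLConvert:
    def __init__(self, csv_path, xml_path, out_path):
        self._csv = csv_path
        self._xml = xml_path
        self._out = out_path

        self._kv_dict = self.prepare_kv_dict()

    def prepare_kv_dict(self):
        # Build {userID: email}; each CSV row is "email,userID" as in the question.
        with open(self._csv, newline='', encoding='utf-8') as f:
            reader = csv.reader(f)
            result = dict()
            for row in reader:
                if len(row) >= 2:  # skip blank or malformed rows
                    result[row[1]] = row[0]
            return result

    def convert(self):
        # Stream the XML line by line so the whole file is never held in memory.
        with open(self._xml, 'r', encoding='utf-8') as f:
            for line in f:
                _line = self.convert_line(line)
                yield _line

    def convert_line(self, line):
        # self._kv_dict = {'lasfir1': 'First.Name@abc.com'}
        for k, v in self._kv_dict.items():
            if k in line:
                # Returns after the first userID that matches, so any other
                # userID on the same line is left unchanged.
                return line.replace(k, v)
        return line

    def start(self):
        with open(self._out, 'w', encoding='utf-8') as f:
            for line in self.convert():
                f.write(line)


if __name__ == '__main__':
    csv_file, xml_file, out_file = parse_args()
    converter = XMLConvert(csv_file, xml_file, out_file)
    converter.start()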


I am trying to add threads and adapt the script to make it run faster. If anyone has a better way, please let me know.



























  • The script above replaces only the first instance of the "string"; if there are more, it will need to be modified.
    – gagneet
    Nov 30 at 3:44
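
The limitation in the comment above comes from `convert_line` returning as soon as one userID from the CSV is found in the line. One possible adjustment, shown here as a standalone function rather than the author's method, is to keep looping and apply every replacement that matches. This sketch assumes that no email address contains another userID as a substring; otherwise a later pass could rewrite part of an email that was just inserted.

```python
def convert_line(line, kv_dict):
    """Apply every userID -> email replacement that matches, not just the first."""
    for user_id, email in kv_dict.items():
        if user_id in line:
            line = line.replace(user_id, email)
    return line


kv = {
    "lasfir1": "FirstName.LastName@abc.com",
    "lisfar1": "FarstName.ListName@abc.com",
}
print(convert_line("<string>lasfir1</string><string>lisfar1</string>", kv))
```

This keeps the structure of the posted script, but it still checks every CSV entry against every line, so it does not by itself reduce the running time on a very large file.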












































edited Nov 21 at 6:08

























answered Nov 21 at 5:57









stack0114106













  • when I run this script, it gives me an error sometimes: "The specified path is invalid." The command is: perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat users.csv) ; $content=qx(cat entities_email.xml);while($content=~/(<row>)(.*?)(</row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3n"; } exit }, and both the files are available in the folder in which I am running the command. This is on a Windows10 command line.
    – gagneet
    Nov 27 at 22:14












  • can you share the sample xml
    – stack0114106
    Nov 28 at 3:31










  • it is a 1.7GB file. :-(
    – gagneet
    Nov 30 at 3:45


















  • when I run this script, it gives me an error sometimes: "The specified path is invalid." The command is: perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat users.csv) ; $content=qx(cat entities_email.xml);while($content=~/(<row>)(.*?)(</row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3n"; } exit }, and both the files are available in the folder in which I am running the command. This is on a Windows10 command line.
    – gagneet
    Nov 27 at 22:14












  • can you share the sample xml
    – stack0114106
    Nov 28 at 3:31










  • it is a 1.7GB file. :-(
    – gagneet
    Nov 30 at 3:45
















when I run this script, it gives me an error sometimes: "The specified path is invalid." The command is: perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat users.csv) ; $content=qx(cat entities_email.xml);while($content=~/(<row>)(.*?)(</row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3n"; } exit }, and both the files are available in the folder in which I am running the command. This is on a Windows10 command line.
– gagneet
Nov 27 at 22:14






when I run this script, it gives me an error sometimes: "The specified path is invalid." The command is: perl -ne 'BEGIN { %kv=map{chomp;(split(",",$_))[1,0] } qx(cat users.csv) ; $content=qx(cat entities_email.xml);while($content=~/(<row>)(.*?)(</row>)/smg) { $xml=$2;foreach $y (keys %kv) { $xml=~s/${y}/$kv{$y}/gm; } print "$1$xml$3n"; } exit }, and both the files are available in the folder in which I am running the command. This is on a Windows10 command line.
– gagneet
Nov 27 at 22:14














can you share the sample xml
– stack0114106
Nov 28 at 3:31




can you share the sample xml
– stack0114106
Nov 28 at 3:31












it is a 1.7GB file. :-(
– gagneet
Nov 30 at 3:45




it is a 1.7GB file. :-(
– gagneet
Nov 30 at 3:45












up vote
0
down vote













Created a script using Python3, which takes in the input as the CSV and the XML file and outputs an XML file with the changes. The command is:



python xml_converter.py –csvfile file.csv –xmlfile file.xml –outfile output_file.xml


Not totally optimized as I would want it to be and running on a single thread, and assumption is that the files are utf-8 encoded.



usage: Replace username to user email of a given xml file
[-h] --csvfile CSVFILE --xmlfile XMLFILE --outfile OUTFILE

optional arguments:
-h, --help show this help message and exit
--csvfile CSVFILE csv file that provide user name and email pair
--xmlfile XMLFILE xml file that to be searched and replaced
--outfile OUTFILE output file name


The basic script is:



class XMLConvert:
def __init__(self, csv, xml, out):
self._csv = csv
self._xml = xml
self._out = out

self._kv_dict = self.prepare_kv_dict()

def prepare_kv_dict(self):
with open(self._csv, newline='', encoding='utf-8') as f:
reader = csv.reader(f)
result = dict()
for row in reader:
result[row[1]] = row[2]
return result

def convert(self):
with open(self._xml, 'r', encoding='utf-8') as f:
for line in f:
_line = self.convert_line(line)
yield _line

def convert_line(self, line):
# self._kv_dict = {'lasfir1': 'First.Name@abc.com'}
for k, v in self._kv_dict.items():
if k.lower() in line:
# print(line)
return re.sub(r'{}'.format(k), v, line)
return line

def start(self):
with open(self._out, 'w', encoding='utf-8') as f:
for line in self.convert():
f.write(line)


if __name__ == '__main__':
csv_file, xml_file, out_file = parse_args()
converter = XMLConvert(csv_file, xml_file, out_file)
converter.start()


I am trying to add threads and modify it accordingly to optimize the running of it. If anyone has a better way then please do inform.



























  • The script above replaces only the first instance of the "string"; if there are more, it will need to be modified.
    – gagneet
    Nov 30 at 3:44















answered Nov 21 at 4:33









gagneet


















