Issue
I'm using Python3, Linux Mint and Visual Studio Code.
I have some code that reads a directory and prints some xml files like so:
persistence_security_dcshadow_4742.xml
Network_Service_Guest_added_to_admins_4732.xml
spoolsample_5145.xml
LM_Remote_Service02_7045.xml
DE_RDP_Tunneling_4624.xml
I'm trying to figure out how to write so that only the integers remain after I have run this read script, i.e., removing all text with only numbers remaining. I tried using regular expression with use of the import re
module but didn't have much luck.
Solution
Use regex with [0-9]
import re
regex = r'[0-9]+'
xmls = [
'persistence_security_dcshadow_4742.xml',
'Network_Service_Guest_added_to_admins_4732.xml',
'spoolsample_5145.xml',
'LM_Remote_Service02_7045.xml',
'DE_RDP_Tunneling_4624.xml',
]
for xml in xmls:
matches = re.findall(regex, xml)
number = matches[-1]
print(number)
> 4742
> 4732
> 5145
> 7045
> 4624
UPDATE
If you want to print the numbers only after all the files have been read, then you can create a function that takes a list of xml files and returns the corresponding number for each file
import re
def xmls_to_numbers(xmls):
regex = r'[0-9]+'
numbers = [ ]
for xml in xmls:
matches = re.findall(regex, xml)
number = matches[-1]
numbers.append(number)
return numbers
xmls = [
'persistence_security_dcshadow_4742.xml',
'Network_Service_Guest_added_to_admins_4732.xml',
'spoolsample_5145.xml',
'LM_Remote_Service02_7045.xml',
'DE_RDP_Tunneling_4624.xml',
]
print(xmls_to_numbers(xmls))
> ['4742', '4732', '5145', '7045', '4624']
Answered By - Amine Messaoudi Answer Checked By - Mary Flores (WPSolving Volunteer)