I have a list of PDF filenames that need to be parsed and ultimately sent to a SQL table, with each parsed piece in its own column. How would I split on a dash '-' and ultimately get the pieces into a table?

What cmdlets would you start with to split on a character? I need to split based on the dash '-'.

Thanks for the help.

Example File Names:

  • tester-2458-full_contact_snapshot-20200115_1188.pdf
  • tester-2458-limited_contact_snapshot-20200119_9330.pdf

Desired Results:

(image of the desired results table not available)

3 Answers

There is also a -split operator.

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_split

Basic example, if you have the file names in a $FilePaths array:

foreach ($filepath in $FilePaths)
{
  # split the name on the dash and map the pieces to named properties
  $parts = $filepath -split '-'
  [pscustomobject]@{ User = $parts[0]; AppID = $parts[1]; FileType = $parts[2]; FilePath = $filepath }
}
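
To get the pieces into columns, one option is to collect the objects the loop emits and pipe them to Format-Table for display. The following is only a sketch using the sample names from the question; the $rows variable, the Date property taken from $parts[3], and the removal of the .pdf extension are assumptions added for illustration and are not part of the answer above.

```powershell
# Sample input taken from the question; replace with your own list of names.
$FilePaths = @(
    'tester-2458-full_contact_snapshot-20200115_1188.pdf'
    'tester-2458-limited_contact_snapshot-20200119_9330.pdf'
)

# Collect one object per file name; each property becomes a column.
$rows = foreach ($filepath in $FilePaths) {
    $parts = $filepath -split '-'
    [pscustomobject]@{
        User     = $parts[0]
        AppID    = $parts[1]
        FileType = $parts[2]
        Date     = $parts[3] -replace '\.pdf$', ''   # assumption: drop the extension
        FilePath = $filepath
    }
}

# Display as a table (screen output only; keep $rows for further processing).
$rows | Format-Table -AutoSize
```

Because $rows still holds the original objects, it can also be piped to Export-Csv or another cmdlet later.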

2 Comments

Thanks @Nasir, this is helpful and does split it as desired. How would I get the -split results into a column format? I piped it to Format-Table, but it doesn't change the result.
A basic version: $parts = $filepath -split '-'; [pscustomobject]@{"User" = $parts[0]; "AppID" = $parts[1]; "FileType" = $parts[2]; "FilePath"=$filepath}

Use $variable.Split('-'), which returns a string array with one element for each piece produced by the split.
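
As a rough illustration with one of the file names from the question ($name and $parts are just placeholder variable names):

```powershell
$name  = 'tester-2458-full_contact_snapshot-20200115_1188.pdf'

# .Split() treats the separator as a literal character, not a regex
$parts = $name.Split('-')

$parts.Count   # 4
$parts[0]      # tester
$parts[2]      # full_contact_snapshot
```

Note that the last element still carries the .pdf extension, so it may need trimming before it goes into a table.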

1 Comment

Nasir's and LordPupazz's answers successfully split the filename.

yet another way is to use regex & named capture groups. [grin]

what it does ...

  • creates a set of file name strings to work with
    when ready to use real data, remove the entire #region/#endregion block and use either (Get-ChildItem).Name or another method that gives you plain strings.
  • iterates thru the collection of file name strings
  • uses $Null = to suppress the False/True output of the -match call
  • does a regex match with named capture groups
  • uses the $Matches automatic variable to plug the captured values into the desired properties of a [PSCustomObject]
  • sends that PSCO out to the $Results collection
  • displays that on screen
  • sends it to a CSV for later use

the code ...

#region >>> fake reading in a list of file names
#    in real life, use (Get-ChildItem).Name
$InStuff = @'
tester-2458-full_contact_snapshot-20200115_1188.pdf
tester-2458-limited_contact_snapshot-20200119_9330.pdf
'@ -split '\r?\n'
#endregion >>> fake reading in a list of file names

$Results = foreach ($IS_Item in $InStuff)
    {
    $Null = $IS_Item -match '^(?<User>.+)-(?<AppId>.+)-(?<FileType>.+)-(?<Date>.+)\.pdf$'
    [PSCustomObject]@{
        User = $Matches.User
        AppId = $Matches.AppId
        FileType = $Matches.FileType
        Date = $Matches.Date
        FileName = $IS_Item
        }
    }

# display on screen    
$Results

# send to CSV file
$Results |
    Export-Csv -LiteralPath "$env:TEMP\JM1_-_FileReport.csv" -NoTypeInformation

output to screen ...

User     : tester
AppId    : 2458
FileType : full_contact_snapshot
Date     : 20200115_1188
FileName : tester-2458-full_contact_snapshot-20200115_1188.pdf

User     : tester
AppId    : 2458
FileType : limited_contact_snapshot
Date     : 20200119_9330
FileName : tester-2458-limited_contact_snapshot-20200119_9330.pdf

content of the "$env:TEMP\JM1_-_FileReport.csv" file ...

"User","AppId","FileType","Date","FileName"
"tester","2458","full_contact_snapshot","20200115_1188","tester-2458-full_contact_snapshot-20200115_1188.pdf"
"tester","2458","limited_contact_snapshot","20200119_9330","tester-2458-limited_contact_snapshot-20200119_9330.pdf"
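
one thing to watch for with real data: when -match fails, $Matches keeps the values from the last successful match, so a name that does not fit the pattern would silently repeat the previous row. a possible variation of the loop that skips such names is sketched below. the second sample name and the $Pattern, $Names, and $Checked variable names are made up for this illustration.

```powershell
$Pattern = '^(?<User>.+)-(?<AppId>.+)-(?<FileType>.+)-(?<Date>.+)\.pdf$'
$Names = @(
    'tester-2458-full_contact_snapshot-20200115_1188.pdf'
    'not_a_matching_name.pdf'    # hypothetical name with no dashes
    'tester-2458-limited_contact_snapshot-20200119_9330.pdf'
)

$Checked = foreach ($Name in $Names)
    {
    # only build an object when the pattern actually matched
    if ($Name -match $Pattern)
        {
        [PSCustomObject]@{
            User = $Matches.User
            AppId = $Matches.AppId
            FileType = $Matches.FileType
            Date = $Matches.Date
            FileName = $Name
            }
        }
        else
        {
        # warnings go to the warning stream, so they do not end up in $Checked
        Write-Warning "Skipping '$Name' - it does not match the expected pattern."
        }
    }

$Checked
```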

6 Comments

For anyone trying to understand regex, there are some helpful tools I learned about from this article: powershellexplained.com/… in the section 'Regex Resources'. The one that's helped me the most so far is regex101.com.
@JM1 - i use regex101.com rather often ... it is a truly lovely resource! [grin]
Thanks @Lee_Dailey for your response. I'm trying to understand it before I ask you anything further. I'm still not getting what the question mark and <User> are doing in this bit: (?<User>.+) .
When adding the real code, I added a Format-Table to the results to ultimately get to my goal. The other answers did split the path, but this most fully answered the need to ultimately put the results into a table in the overall request. Thanks @Lee_Dailey and all who responded.
@JM1 - the (?<NameOfThing>.+) is how you do a named capture group. the ?<> part is specially for that purpose. ///// please be VERY careful of the Format-* cmdlets. they destroy your objects, wrap the butchered bits in formatting code, and then send that out. they are for FINAL output to screen or a plain text file. if you need to use the data later, then stick with Select-Object. ///// you are very welcome! i'm glad to have helped a bit! [grin]
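
A small sketch of the Format-* point above, using a made-up object ($Item, $Kept, and $Formatted are illustrative names): Select-Object hands back an object whose properties can still be read, while Format-Table emits formatting objects that no longer expose them.

```powershell
$Item = [PSCustomObject]@{ User = 'tester'; AppId = '2458' }

# Select-Object keeps real properties that later commands can use
$Kept = $Item | Select-Object -Property User, AppId
$Kept.User

# Format-Table replaces the object with formatting instructions
$Formatted = $Item | Format-Table
$Formatted[0].GetType().FullName
```

The second type name falls under Microsoft.PowerShell.Commands.Internal.Format, which is why piping Format-Table output to something like Export-Csv does not give the expected columns.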