PowerShell Script to Find Duplicate Files and Delete Them


Okay first of all… hahah j/k
 

My Computers

System One System Two

  • OS
    Windows 11 Pro 24H2
    Computer type
    PC/Desktop
    Manufacturer/Model
    Intel NUC12WSHi7
    CPU
    12th Gen Intel Core i7-1260P, 2100 MHz
    Motherboard
    NUC12WSBi7
    Memory
    64 GB
    Graphics Card(s)
    Intel Iris Xe
    Sound Card
    built-in Realtek HD audio
    Monitor(s) Displays
    Dell U3219Q
    Screen Resolution
    3840x2160 @ 60Hz
    Hard Drives
    Samsung SSD 990 PRO 1TB
    Keyboard
    CODE 104-Key Mechanical with Cherry MX Clears
    Antivirus
    Microsoft Defender
  • Operating System
    Linux Mint 21.2 (Cinnamon)
    Computer type
    PC/Desktop
    Manufacturer/Model
    Intel NUC8i5BEH
    CPU
    Intel Core i5-8259U CPU @ 2.30GHz
    Memory
    32 GB
    Graphics card(s)
    Iris Plus 655
    Keyboard
    CODE 104-Key Mechanical with Cherry MX Clears
I want to point out a few "rookie mistakes", if you're new to PS scripting.

1. "Test-Path $path" returns $true even if $path is an ordinary file rather than a folder. "Test-Path -PathType Container $path" (-Type is an accepted alias for -PathType) specifically checks that the passed argument is a folder.

2. Your list of displayed files doesn't show the full or relative pathname. This is a major problem if the same file has been copied into multiple subfolders, because every copy is listed under the same bare name. Take the FullName property of each Get-ChildItem result ($_.FullName), and store that in your array list.

3. Another way to re-organize the script's workflow is to expand on your original idea of grouping the files by a shared hash value. We do one pass to recursively collect the combined hash and filename values into an array. After grouping them by unique hash, and counting each group's membership size, we can determine which groups have member counts > 1.

If we cast the hash list variable to the [System.Collections.ArrayList] type, items can be removed from the list dynamically. Without the type cast, PS will complain that the created array is fixed in size, and cannot be modified.

4. We can take the master list of hash values, and loop through multiple processing passes. Every time we select and delete a duplicated file, the hash list gets trimmed by removing the selected filename out of the list. We can continue looping, until the check reveals no individual hash group has a membership larger than a single member (they are all unique).

If the user wants to quit at any point, we allow them to specify that on the input line.

5. At a certain scale, when there are too many duplicated files or hash groups, the script will be less useful because there will be too much text to display. Then you will have to learn how to use WinForms, or some graphical UI, so the user can scroll through a long list of selections.

PS's Out-GridView cmdlet is very useful, but only if the user has the PowerShell ISE app installed (because that's where the cmdlet lives). Most users will have it available, but beware that some Windows debloaters remove ISE, which deprives you of Out-GridView.
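
Points 1 to 3 can be sketched roughly like this. It is only an illustration, not FB's script: the function name Get-DuplicateHashGroup and its parameters are made up for the example, and MD5 is the default only because the original script uses it.

```powershell
# Hypothetical helper (name and parameters are assumptions for illustration).
function Get-DuplicateHashGroup {
    param(
        [Parameter(Mandatory)]
        [string]$Path,
        [string]$Algorithm = 'MD5'
    )

    # Point 1: accept only folders; a plain Test-Path is also true for files.
    if (-not (Test-Path -LiteralPath $Path -PathType Container)) {
        throw "Not a directory: $Path"
    }

    # Points 2 and 3: one recursive pass, storing each hash with its full path.
    $HashList = [System.Collections.ArrayList]@(
        Get-ChildItem -LiteralPath $Path -Recurse -File | ForEach-Object {
            Get-FileHash -LiteralPath $_.FullName -Algorithm $Algorithm |
                Select-Object Hash, Path
        }
    )

    # Keep only the hash groups that have more than one member.
    $HashList | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 }
}
```

Each returned group has a Name (the shared hash) and a Group (the Hash/Path pairs), so something like "Get-DuplicateHashGroup -Path $SomeFolder | ForEach-Object { $_.Group.Path }" would list the full paths of every duplicated file.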
 

My Computer

System One

  • OS
    Windows 7
$fd = Get-ChildItem -Path E:\Portable\ -Recurse -File |
    Group-Object -Property Length |
    Where-Object { $_.Count -gt 1 } |
    Select-Object -ExpandProperty Group |
    Get-FileHash |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group | Select-Object Path, Hash }


$fd | Out-GridView -Title "Select files to delete" -PassThru | Remove-Item -Verbose -WhatIf



You can also replace duplicate files with hard links.
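
A rough sketch of the hard link idea, in case it helps. The function name Convert-DuplicateToHardLink is invented for this example, and it assumes both files sit on the same volume, since hard links cannot cross volumes. It re-checks the hashes first so that it never links files that actually differ.

```powershell
# Hypothetical helper: replaces $Duplicate with a hard link to $Original.
# Assumes both paths are on the same volume (hard links cannot span volumes).
function Convert-DuplicateToHardLink {
    param(
        [Parameter(Mandatory)][string]$Original,
        [Parameter(Mandatory)][string]$Duplicate
    )

    $src = (Resolve-Path -LiteralPath $Original).ProviderPath
    $dst = (Resolve-Path -LiteralPath $Duplicate).ProviderPath
    if ($src -eq $dst) {
        throw "Original and duplicate are the same path: $src"
    }

    # Refuse to touch anything unless the contents really match.
    $a = (Get-FileHash -LiteralPath $src -Algorithm SHA256).Hash
    $b = (Get-FileHash -LiteralPath $dst -Algorithm SHA256).Hash
    if ($a -ne $b) {
        throw "Files differ, not linking: $dst"
    }

    # Delete the duplicate, then recreate its name as a hard link to the original.
    Remove-Item -LiteralPath $dst -Force
    New-Item -ItemType HardLink -Path $dst -Target $src | Out-Null
}
```

After this, both names point at the same data on disk, so editing the file through one name changes what you see through the other. That is a behaviour change compared with two independent copies, so it only suits files you don't intend to edit separately.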
 

My Computer

System One

  • OS
    Microsoft Windows 11 Home
    Computer type
    PC/Desktop
    Manufacturer/Model
    MSI MS-7D98
    CPU
    Intel Core i5-13490F
    Motherboard
    MSI B760 GAMING PLUS WIFI
    Memory
    2 x 16 Patriot Memory (PDP Systems) PSD516G560081
    Graphics Card(s)
    GIGABYTE GeForce RTX 4070 WINDFORCE OC 12G (GV-N4070WF3OC-12GD)
    Sound Card
    Bluetooth Audio
    Monitor(s) Displays
    INNOCN 15K1F
    Screen Resolution
    1920 x 1080
    Hard Drives
    WD_BLACK SN770 250GB
    KINGSTON SNV2S1000G (ELFK0S.6)
    PSU
    Thermaltake Toughpower GF3 1000W
    Case
    CG560 - DeepCool
    Cooling
    ID-COOLING SE-224-XTS / 2 x 140Mm Fan - rear and top; 3 x 120Mm - front
    Keyboard
    Corsair K70 RGB TKL
    Mouse
    Corsair KATAR PRO XT
    Internet Speed
    100 Mbps
    Browser
    Firefox
    Antivirus
    Microsoft Defender Antivirus
    Other Info
    https://www.userbenchmark.com/UserRun/66553205
The three 'duplicates' that FB's script found in my Desktop folder were unique text files with only a few words each. According to File Explorer and the DIR command, they were all zero bytes. But they are all different and should not have had identical hashes.
 

My Computers

System One System Two

  • OS
    Windows 11 Pro 24H2 26100.2894
    Computer type
    Laptop
    Manufacturer/Model
    Acer Swift SF114-34
    CPU
    Pentium Silver N6000 1.10GHz
    Memory
    4GB
    Screen Resolution
    1920 x 1080
    Hard Drives
    SSD
    Cooling
    fanless
    Internet Speed
    150 Mbps
    Browser
    Brave
    Antivirus
    Webroot Secure Anywhere
    Other Info
    System 3

    ASUS T100TA Transformer
    Processor Intel Atom Z3740 @ 1.33GHz
    Installed RAM 2.00 GB (1.89 GB usable)
    System type 32-bit operating system, x64-based processor

    Edition Windows 10 Home
    Version 22H2 build 19045.3570
  • Operating System
    Windows 11 Pro 23H2 22631.2506
    Computer type
    Laptop
    Manufacturer/Model
    HP Mini 210-1090NR PC (bought in late 2009!)
    CPU
    Atom N450 1.66GHz
    Memory
    2GB
    Browser
    Brave
    Antivirus
    Webroot
Zero-length files still have a computed MD5 value.

Code:
PS C:\Users\GARLIN\Downloads> dir .\BOOT_TEST\ZERO

    Directory: C:\Users\GARLIN\Downloads\BOOT_TEST

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----        1/24/2025  11:42 PM              0 ZERO

PS C:\Users\GARLIN\Downloads> Get-FileHash -Path .\BOOT_TEST\ZERO -Algorithm MD5
Algorithm       Hash                                                                   Path                                                   
---------       ----                                                                   ----                                                                                                                                                        
MD5             D41D8CD98F00B204E9800998ECF8427E                                       C:\Users\GARLIN\Downloads\BOOT_TEST\ZER
Code:
PS C:\Users\GARLIN\Downloads> .\FRIDAY.ps1
Please Enter a Directory Path to Scan: C:\Users\GARLIN\Downloads\BOOT_TEST
MD5: D41D8CD98F00B204E9800998ECF8427E
[1] "C:\Users\GARLIN\Downloads\BOOT_TEST\ZERO"
[2] "C:\Users\GARLIN\Downloads\BOOT_TEST\SUB\SUB2\ZERO"

Pick one of the files to delete, 'q' to quit:
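
To see why every zero-length file lands in the same group, here is a small sketch that hashes zero bytes directly, with no file involved. The function name Get-EmptyInputMD5 is made up for the example.

```powershell
# Hypothetical helper: MD5 of an empty byte array, formatted like Get-FileHash output.
function Get-EmptyInputMD5 {
    $md5 = [System.Security.Cryptography.MD5]::Create()
    try {
        $bytes = $md5.ComputeHash([byte[]]@())
        # Join the 16 bytes as upper-case hex pairs.
        -join ($bytes | ForEach-Object { $_.ToString('X2') })
    }
    finally {
        $md5.Dispose()
    }
}

Get-EmptyInputMD5   # D41D8CD98F00B204E9800998ECF8427E, the same value as in the listing above
```

So any two files that report 0 bytes will always match each other, whatever the algorithm. If that is not wanted, one option is to exclude them before hashing, for example with Where-Object { $_.Length -gt 0 } right after Get-ChildItem.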
 

All three files produce identical hashes! Why is this?

I used this script, generated by ChatGPT. It explains why FB's script does not work.

# Define the file path
$filePath = "C:\Users\Admin\Desktop\Satellite dish.txt"

# Check if the file exists
if (Test-Path $filePath) {
    # Compute the hash of the file
    $hash = Get-FileHash -Path $filePath -Algorithm SHA256

    # Display the hash
    Write-Output "File: $filePath"
    Write-Output "Hash (SHA256): $($hash.Hash)"
} else {
    # Display an error message if the file does not exist
    Write-Output "Error: File not found at path $filePath"
}
 

Ooops, I see FB used MD5 and ChatGPT used SHA256. Will try again.

MD5 also returns identical hashes for three non-identical, but small txt files.
 
SHA256 is slower than MD5, which is why I use MD5. Also, my script works; I don't know how you are coming up with these results. My script has been tested by many programmers, not just by you, and no one else found it not working, so the problem is on your end.
 

My Computer

System One

  • OS
    Windows 11
    Computer type
    PC/Desktop
    Manufacturer/Model
    HP Pavilion
    CPU
    AMD Ryzen 7 5700G
    Motherboard
    Erica6
    Memory
    Micron Technology DDR4-3200 16GB
    Graphics Card(s)
    NVIDIA GeForce RTX 3060
    Sound Card
    Realtek ALC671
    Monitor(s) Displays
    Samsung SyncMaster U28E590
    Screen Resolution
    3840 x 2160
    Hard Drives
    SAMSUNG MZVLQ1T0HALB-000H1
@FreeBooter

You are teaching yourself PS and I am trying to help by pointing out that your script is not infallible; no need to be rude.
 
You are the rude one, telling me that a script I have tested multiple times does not work when you are not a programmer. How is that helping me? This is not the first time; you did the same with my other scripts, all of which do not work for you for some reason. Also, I don't need your help to learn: I have two Discord servers full of programmers, and if I need help I will ask them.
 

I copied these files, and the new copies now have sizes of 70 to 200 bytes, whereas the originals were zero bytes.
 

Do you know how my script works?
 

It does not work properly, as I have informed you. Let's leave it. It is not my intention to upset you. Bye.
 

It works for me and for others, just not for you. None of my scripts work for you.
 

Agreed, I should have said the scripts do not work for me. The three duplicates that were actually different seem to be corrupt or the content I could see was in an alternate data stream. I apologise for saying your scripts don't work.
 

Some further notes about the method of hashing everything and comparing or grouping hashes later...

If you have these files, for example:

Code:
Name                                    Length
----                                    ------
nuclear plans from around the world.wim 10,780,988,855
Sonnets About Travis Kelce.txt          2,394
Taylor Swift lyrics in Esperanto.txt    2,394

Using the "hash everything" method, on an internal SSD, takes me about 19.8 seconds, because it's hashing that big WIM file. There is no reason to hash this file, because looking at its size, it cannot possibly be a duplicate of the other files. We only need to compare the two text files. Doing that takes 0.013 seconds.

Using a real-world example, I have a folder of 36,977 files, 80.9 GB in size, on an external spinny disk attached via USB 3. Using the "hash everything" method takes just over 15 minutes to run through this folder, and that's using MD5 as the hashing algorithm.

Using the method I outlined in #21 takes less than 2 seconds. Using @abactuon's method in #43, after I fixed it, reports similar times. Both of our methods default to SHA-2 algorithms, SHA-256 specifically. So, we are hashing more slowly, but it matters little because we're hashing only when needed.
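
For anyone who wants to see the size-first idea in code, here is a minimal sketch. It is not the code from #21 or #43 (neither is reproduced in this part of the thread); the function name Find-DuplicateBySize and its parameters are assumptions made up for the example.

```powershell
# Hypothetical helper (name and parameters are assumptions for illustration).
function Find-DuplicateBySize {
    param(
        [Parameter(Mandatory)][string]$Path,
        [string]$Algorithm = 'SHA256'
    )

    # Pass 1 (cheap): a file whose size is unique cannot have a duplicate.
    # The @() wrapper keeps this an empty array when nothing qualifies.
    $Candidates = @(
        Get-ChildItem -LiteralPath $Path -Recurse -File |
            Group-Object -Property Length |
            Where-Object { $_.Count -gt 1 } |
            ForEach-Object { $_.Group }
    )

    # Pass 2 (expensive): hash only the files that survived pass 1.
    $Candidates |
        ForEach-Object { Get-FileHash -LiteralPath $_.FullName -Algorithm $Algorithm } |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 }
}
```

With the three example files above, the WIM is dropped in pass 1 because nothing else shares its length, and only the two 2,394-byte text files are ever read and hashed.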
 

Another consideration is there can be hard or symbolic links, which are duplicate files by design. You can use Get-ChildItem's LinkType & Target fields to check if they need to be excluded.

Code:
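$FilesList = [System.Collections.ArrayList]@()

Get-ChildItem $path -Recurse -File | Select-Object FullName,LinkType,Target | ForEach-Object {
    $File = $_
    $FullName = $File.FullName

    switch ($File.LinkType) {
        'HardLink' {
            # Target may list several sibling links; skip this path if any of them is already listed.
            if (-not ($File.Target | Where-Object { $FilesList -contains $_ })) {
                # Use Add(); += would silently turn the ArrayList into a fixed-size array.
                [void]$FilesList.Add($FullName)
            }
            else {
                Write-Host "Skipping Hard Link: `"$FullName`""
            }
        }
        'SymbolicLink' {
            Write-Host "Skipping Symbolic Link: `"$FullName`""
        }
        default {
            [void]$FilesList.Add($FullName)
        }
    }
}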
 

I started a new thread to avoid hijacking this one. Can anyone explain why Notepad was caching three files elsewhere?

 

Do I have to edit that script to add the file name and path?
 

Here's my updated version.
Code:
while (1) {
    $Path = Read-Host "Please Enter a Directory Path to Scan"

    if (-not (Test-Path -Type Container $Path)) {
        Write-Host "Invalid directory path, please try again.`n"
    }
    else {
        break
    }
}

# Create array as type [System.Collections.ArrayList], so we can delete items from the list.

$FilesList = [System.Collections.ArrayList]@()

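Get-ChildItem $Path -Recurse -File | Select-Object FullName,LinkType,Target | ForEach-Object {
    $File = $_
    $FullName = $File.FullName

    switch ($File.LinkType) {
        'HardLink' {
            # Target may list several sibling links; skip this path if any of them is already listed.
            if (-not ($File.Target | Where-Object { $FilesList -contains $_ })) {
                # Use Add(); += would silently turn the ArrayList into a fixed-size array.
                [void]$FilesList.Add($FullName)
            }
            else {
                Write-Host "Skipping Hard Link: `"$FullName`""
                $Skipped = $true
            }
        }
        'SymbolicLink' {
            Write-Host "Skipping Symbolic Link: `"$FullName`""
            $Skipped = $true
        }
        default {
            [void]$FilesList.Add($FullName)
        }
    }
}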

if ($Skipped) {
    Write-Host ""
}

$HashList = [System.Collections.ArrayList]@(
    $FilesList | foreach {
        Get-FileHash -LiteralPath $_ -Algorithm MD5 | select Hash,Path
    }
)

if (($HashList | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 }).Count -eq 0) {
    Write-Host "No duplicate files found."
    exit 0
}

while (1) {
    $FilenameList = @{}
    $Index = 1

    foreach ($Hash in ($HashList | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 })) {
        Write-Host "MD5: $($Hash.Name)"
        foreach ($File in $Hash.Group.Path) {
            Write-Host "[$Index] `"$File`""

            #  Build list of duplicated files, in numbered order
            $FilenameList[$Index] = $File
            $Index++
        }
        Write-Host ""
    }

    $Selection = Read-Host "Pick one of the files to delete, 'q' to quit"

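    if ($Selection -match '^\s*q') {
        break
    }
    elseif ($Selection -notmatch '^\s*\d+\s*$') {
        # Anything that is not a whole number would make the [int] cast fail
        Write-Host "`"$Selection`" is not a number"
        continue
    }
    else {
        # Recast $Selection as integer to avoid problems later
        $Selection = [int]$Selection
    }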

    if ($Selection -lt 1 -or $Selection -ge $Index) {
        Write-Host "$Selection is not valid, or out of range"
    }
    else {
        $DeletedFile = $FilenameList[$Selection]
        Write-Host "Deleting `"$DeletedFile`"`n"
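        # -LiteralPath keeps names containing [ ] or * from being treated as wildcards
        Remove-Item -LiteralPath $DeletedFile -Force

        # Remove the deleted file from $HashList & $FilenameList (exact match, so paths that merely contain it are kept)
        $HashList = @($HashList | where { $_.Path -ne $DeletedFile })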
        $FilenameList.Remove($Selection)
    }

    if ($FilenameList.Count -eq 1) {
        break
    }
}
 