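Using VB.NET, I receive a list of objects. I need to remove duplicate records from this list so that it is unique across all columns. From reading on StackOverflow, the general idea is to use GroupBy followed by Select, but whenever I try this, the whole collection is still returned.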
For the example below, I expect the count in uniqueData to be 3, but it is still 4. dataSet1 and dataSet4 should be treated as duplicates of each other.
Module Program
    Sub Main(args As String())
        Dim data As New PersonCollection With {.PersonCollection = New List(Of PersonInfo)}
        Dim dataSet1 As New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        Dim dataSet2 As New PersonInfo With {.FirstName = "John", .LastName = "Hurt", .Rating = 20}
        Dim dataSet3 As New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 30}
        Dim dataSet4 As New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        data.PersonCollection.Add(dataSet1)
        data.PersonCollection.Add(dataSet2)
        data.PersonCollection.Add(dataSet3)
        data.PersonCollection.Add(dataSet4)
        Dim uniqueData = data.PersonCollection.GroupBy(Function(x) New With {x.FirstName, x.LastName, x.Rating}).Select(Function(x) x.First).ToList()
        Console.ReadLine()
    End Sub

    Private Class PersonCollection
        Property PersonCollection As List(Of PersonInfo)
    End Class

    Private Class PersonInfo
        Property FirstName As String
        Property LastName As String
        Property Rating As Integer
    End Class
End Module
It works with tuples:
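A minimal sketch of what a tuple-keyed query could look like, reusing the PersonInfo shape from the question. The module name is made up for illustration, and tuple literals assume VB 15 (Visual Studio 2017) or later:

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Class PersonInfo
    Public Property FirstName As String
    Public Property LastName As String
    Public Property Rating As Integer
End Class

' Hypothetical module name, used only for this sketch.
Module TupleKeyExample
    Sub Main()
        Dim people As New List(Of PersonInfo) From {
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10},
            New PersonInfo With {.FirstName = "John", .LastName = "Hurt", .Rating = 20},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 30},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        }

        ' A value tuple compares element by element, so rows with equal
        ' FirstName, LastName and Rating land in the same group.
        Dim uniqueData = people.
            GroupBy(Function(x) (x.FirstName, x.LastName, x.Rating)).
            Select(Function(g) g.First()).
            ToList()

        Console.WriteLine(uniqueData.Count) ' 3
    End Sub
End Module
```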
But it is better to use a Comparer:
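One way this could be done is with an IEqualityComparer(Of PersonInfo) passed to Distinct. The comparer class below is a hypothetical name introduced for this sketch, not something from the original post:

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Class PersonInfo
    Public Property FirstName As String
    Public Property LastName As String
    Public Property Rating As Integer
End Class

' Hypothetical comparer: two PersonInfo objects are equal when all three properties match.
Class PersonInfoComparer
    Implements IEqualityComparer(Of PersonInfo)

    Public Overloads Function Equals(x As PersonInfo, y As PersonInfo) As Boolean _
        Implements IEqualityComparer(Of PersonInfo).Equals
        If x Is y Then Return True
        If x Is Nothing OrElse y Is Nothing Then Return False
        Return String.Equals(x.FirstName, y.FirstName) AndAlso
               String.Equals(x.LastName, y.LastName) AndAlso
               x.Rating = y.Rating
    End Function

    Public Overloads Function GetHashCode(obj As PersonInfo) As Integer _
        Implements IEqualityComparer(Of PersonInfo).GetHashCode
        If obj Is Nothing Then Return 0
        ' Reuse the value tuple's hash so equal objects get equal hash codes.
        Return (obj.FirstName, obj.LastName, obj.Rating).GetHashCode()
    End Function
End Class

' Hypothetical module name, used only for this sketch.
Module ComparerExample
    Sub Main()
        Dim people As New List(Of PersonInfo) From {
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10},
            New PersonInfo With {.FirstName = "John", .LastName = "Hurt", .Rating = 20},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 30},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        }

        ' Distinct keeps the first occurrence of each set of equal items.
        Dim uniqueData = people.Distinct(New PersonInfoComparer()).ToList()

        Console.WriteLine(uniqueData.Count) ' 3
    End Sub
End Module
```

The comparer keeps the equality rule in one place, so it can be reused by Distinct, GroupBy, or a HashSet without repeating the key selector.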
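Try using a tuple, because it has value semantics, whereas an anonymous class type has reference semantics. That is, two such objects are compared by reference, not by their fields or properties.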
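To see the difference in isolation, here is a small sketch (the module name is made up) that compares two anonymous objects and two tuples holding the same values:

```vbnet
Option Infer On
Imports System

' Hypothetical module name, used only for this sketch.
Module EqualitySemanticsExample
    Sub Main()
        ' Anonymous types without Key properties do not override Equals,
        ' so two separate instances are never equal.
        Dim anon1 = New With {.FirstName = "Bob", .Rating = 10}
        Dim anon2 = New With {.FirstName = "Bob", .Rating = 10}

        ' Value tuples compare element by element.
        Dim tuple1 = ("Bob", 10)
        Dim tuple2 = ("Bob", 10)

        Console.WriteLine(anon1.Equals(anon2))   ' False
        Console.WriteLine(tuple1.Equals(tuple2)) ' True
    End Sub
End Module
```

This is why the GroupBy in the question produces four groups: every anonymous key is only equal to itself.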
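Note that the syntax New With {x.FirstName ... } does work with O/R mappers, because there it is translated to SQL and the distinction between value and reference semantics no longer applies. Another solution is to declare the properties of the anonymous type as Key properties.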
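A sketch of the Key variant, again with the question's PersonInfo shape and a made-up module name. With every property marked Key, the compiler generates Equals and GetHashCode based on those properties:

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Class PersonInfo
    Public Property FirstName As String
    Public Property LastName As String
    Public Property Rating As Integer
End Class

' Hypothetical module name, used only for this sketch.
Module KeyPropertiesExample
    Sub Main()
        Dim people As New List(Of PersonInfo) From {
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10},
            New PersonInfo With {.FirstName = "John", .LastName = "Hurt", .Rating = 20},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 30},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        }

        ' Key properties take part in equality, so equal values give equal group keys.
        Dim uniqueData = people.
            GroupBy(Function(x) New With {Key x.FirstName, Key x.LastName, Key x.Rating}).
            Select(Function(g) g.First()).
            ToList()

        Console.WriteLine(uniqueData.Count) ' 3
    End Sub
End Module
```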
But tuples are much simpler.
We can simplify the query slightly by using DistinctBy instead of GroupBy followed by Select:
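A minimal sketch of the DistinctBy form, assuming .NET 6 or later (where Enumerable.DistinctBy was introduced) and the same PersonInfo shape as in the question. The module name is made up:

```vbnet
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Class PersonInfo
    Public Property FirstName As String
    Public Property LastName As String
    Public Property Rating As Integer
End Class

' Hypothetical module name, used only for this sketch.
Module DistinctByExample
    Sub Main()
        Dim people As New List(Of PersonInfo) From {
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10},
            New PersonInfo With {.FirstName = "John", .LastName = "Hurt", .Rating = 20},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 30},
            New PersonInfo With {.FirstName = "Bob", .LastName = "Smith", .Rating = 10}
        }

        ' DistinctBy keeps the first element seen for each distinct tuple key.
        Dim uniqueData = people.
            DistinctBy(Function(x) (x.FirstName, x.LastName, x.Rating)).
            ToList()

        Console.WriteLine(uniqueData.Count) ' 3
    End Sub
End Module
```

On older frameworks without DistinctBy, the GroupBy plus Select form with a tuple key gives the same result.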