Cost, Area, and Speed
Caches are made of SRAM, which typically uses a six-transistor (6T) cell to hold a single bit. More capacity means more transistors, which adds to both area and cost.
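To get a feel for the numbers, here is a rough sketch of the transistor count for the data array alone. It assumes a plain 6T cell and ignores tag bits, valid/dirty bits, ECC, decoders, and sense amplifiers, so real counts are higher. The function name and parameters are made up for illustration.

```python
def sram_data_transistors(capacity_bytes: int, transistors_per_bit: int = 6) -> int:
    """Transistors needed for the data cells only (assumed 6T per bit).

    Deliberately ignores tags, status bits, ECC, and peripheral circuitry.
    """
    if capacity_bytes < 0:
        raise ValueError("capacity_bytes must be non-negative")
    bits = capacity_bytes * 8          # 8 bits per byte
    return bits * transistors_per_bit  # one cell per bit


if __name__ == "__main__":
    for label, size in [("32 KiB", 32 * 1024), ("1 MiB", 1024 * 1024)]:
        print(f"{label}: {sram_data_transistors(size):,} transistors")
```

The count grows linearly with capacity, so going from 32 KiB to 1 MiB multiplies the data-array transistors by 32.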
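Moreover, with more cells, an access takes longer: the decoders get bigger and the wordlines and bitlines get longer, so it takes more time to select the relevant set and read it out. A set-associative cache also needs one tag comparator per way, so adding capacity by raising associativity means more comparators.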
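The set and comparator side can be sketched with some simple address arithmetic. This is only an illustration under stated assumptions: power-of-two sizes, a 48-bit physical address, and one tag comparator per way. `CacheGeometry` and `cache_geometry` are hypothetical names, not from any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    sets: int
    offset_bits: int
    index_bits: int
    tag_bits: int
    comparators: int  # assumed: one tag comparator per way


def log2_exact(n: int) -> int:
    """Return log2(n), requiring n to be a positive power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a positive power of two")
    return n.bit_length() - 1


def cache_geometry(capacity_bytes: int, line_bytes: int, ways: int,
                   address_bits: int = 48) -> CacheGeometry:
    """Split an address into tag/index/offset for a set-associative cache."""
    if ways <= 0 or line_bytes <= 0:
        raise ValueError("line_bytes and ways must be positive")
    if capacity_bytes % (line_bytes * ways):
        raise ValueError("capacity must be a multiple of line_bytes * ways")
    sets = capacity_bytes // (line_bytes * ways)
    offset_bits = log2_exact(line_bytes)  # selects a byte within the line
    index_bits = log2_exact(sets)         # selects the set directly
    tag_bits = address_bits - index_bits - offset_bits
    return CacheGeometry(sets, offset_bits, index_bits, tag_bits, comparators=ways)


if __name__ == "__main__":
    print(cache_geometry(32 * 1024, 64, 8))    # baseline
    print(cache_geometry(256 * 1024, 64, 8))   # more sets, same comparators
    print(cache_geometry(64 * 1024, 64, 16))   # more ways, more comparators
```

Under these assumptions, growing the cache by adding sets widens the index (and hence the decoder), while growing it by adding ways increases the number of tag comparisons done in parallel on every lookup.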