Describir: Generic decoding of seen and imagined objects using hierarchical visual features