Popis: |
Proteins are molecules of great importance to all living organisms. They take part in biological processes and give organisms their shape, but they can also cause disease, and in those cases we want to influence their activity. Modern computational methods let us study individual proteins in detail and answer some questions without complex and expensive laboratory procedures. In this work we focus on locating the sites on proteins to which small molecules (ligands) can bind. An important motivation for this kind of research is the discovery of new drugs, since ligand binding can alter a protein's function and thereby stop its negative effects on the organism.
To predict binding sites on proteins we use a three-dimensional convolutional neural network that takes the spatial structure of the protein into account. Our dataset consists of proteins from the PDBbind and sc-PDB databases. When preparing the data, we merge the information on all known ligands across similar proteins rather than using only a selected subset, which generalizes the methods of existing research. We evaluate the model with several metrics and find that, for 54% of the larger and pharmacologically more relevant binding sites, the predicted center lies less than 4 Å from the true center. |