Table of contents
  1. Story
  2. Interview Innovation International
    1. Key Presenters
      1. Dr Ruben Kok
      2. Dr George Strawn
      3. Professor Dr Barend Mons
      4. Dr Niklas Blomberg
      5. Dr René van Schaik
    2. Could you briefly explain the main topic of the Data Fairport conference?
    3. Why do you think this issue of ‘big data’ in the life sciences has become so prominent of late?
    4. What do you see as the main challenges ahead? Is it the technology or are other aspects just as important?
    5. Can you each give your personal vision for the Data Fairport in the long run?
    6. What do you hope to have achieved at the end of this week, coming out of this conference?
    7. The Data Fairport Conference
      1. AIM
      2. APPROACH
      3. SCOPE
    8. KEY DRIVERS
    9. Is this concept unique to life sciences? Do you think it is a model that could be rolled out for other disciplines?
    10. INTELLIGENCE
      1. THE DATA FAIRPORT OBJECTIVES
      2. CONFERENCE PARTICIPANTS
      3. CONTACT
  3. Jointly designing a DATA FAIRPORT
    1. Executive summary
      1. List of unconference Participants
    2. 0 Introduction
      1. List of unconference Participants
    3. 1 Vision, Mission & Scope
      1. Figure 1 The hourglass used for discussions at the meeting
    4. 2 Stakeholder Needs Requirements
      1. Taking the perspective of various stakeholders the following needs can be articulated
    5. 3 Solutions & Technology
    6. 4 Access & Funding Models
      1. The following potential elements of the funding model were discussed
    7. 5 Governance, Organisation & Operations
    8. 6 The next steps: a preparatory phase of 9 months
    9. Appendix I – Elevator Pitches
      1. Elevator Pitch 1
      2. Elevator Pitch 2
      3. Elevator Pitch 3
      4. Elevator Pitch 4
      5. Elevator Pitch 5
    10. Appendix II – Initial list of DATA FAIRPORT use cases
    11. Appendix III – Initial list of DATA FAIRPORT detailed requirements
  4. Slides
    1. Slide 1. A curse of interdisciplinarity
    2. Slide 2. Dutch Techcentre For Life Science
    3. Slide 3. ELIXIR
    4. Slide 4. DISC: the connected data departments of DTL research Hotels
    5. Slide 5. What is bioinformatics?
    6. Slide 6. Bioinformatics underpins life-science research
    7. Slide 7. Life Science data: Multi-omics, multi-technology, multi organism, multi dimensional
    8. Slide 8. From molecules to medicine
    9. Slide 9. What is ELIXIR?
    10. Slide 10. Why ELIXIR?
    11. Slide 11. The challenge
    12. Slide 12. Europe has already paid for the science
    13. Slide 13. ELIXIR’s mission
    14. Slide 14. A distributed pan-European infrastructure
    15. Slide 15. Benefits
    16. Slide 16. The scientific reason for ELIXIR
    17. Slide 17. Commentary
    18. Slide 18. One societal reason for ELIXIR
    19. Slide 19. The financial reason for ELIXIR
    20. Slide 20. Maintaining open access
    21. Slide 21. 13 ELIXIR Countries
    22. Slide 22. Part two >>>> eScience in LS
    23. Slide 23. The Data Deluge
    24. Slide 24. Nanopublications & Cardinal Assertions
    25. Slide 25. Under the hood……
    26. Slide 26. Managing volume & complexity
    27. Slide 27. The LS concept web: 2x2x106 concepts (profiles)
    28. Slide 28. Genes related to Cystic fibrosis
    29. Slide 29. Scatter Plot
    30. Slide 30. Network Graph
    31. Slide 31. eScience…. in silico reasoning and in cerebro validation
    32. Slide 32. Organisation of the ecosystem
    33. Slide 33. ORCID VIVO
    34. Slide 34. IN ANY CASE
    35. Slide 35. Acceptance of Semantic Web Approach
    36. Slide 36. Acknowledging
  5. Spotfire Dashboard
  6. Research Notes
    1. Data FairPort Files Excerpts
    2. Scientific reproducibility and big data
  7. Systematic identification of pharmacogenomics information from clinical trials
    1. Author information
    2. Abstract
    3. Published by Elsevier Inc.
    4. Free PMC Article
    5. Images from this publication
    6. Publication Types, MeSH Terms, Grant Support
      1. Publication Types
      2. MeSH Terms
      3. Grant Support
    7. LinkOut - more resources
      1. Full Text Sources
      2. Medical
  8. NIH Public Access
    1. Abstract
    2. 1. Introduction
      1. Figure 1
      2. Figure 2
      3. Figure 3
    3. 2. Related work
      1. 2.1. Curated gene–drug–disease relationships in PharmGKB
      2. 2.2. Text mining techniques for extracting PGx concepts and relationships
      3. 2.3. Other text-mining applications to clinical trial records
    4. 3. Methods
      1. Figure 4
      2. 3.1. Preprocessing clinical trial records
      3. 3.2. Extracting gene–drug–disease relationships
      4. 3.3. Indexing clinical trials
      5. 3.4. Hypothesis testing and method evaluation
    5. 4. Results
      1. 4.1. Comparative evaluation of ClinicalTrials.gov
        1. Figure 5
      2. 4.2. Assessment of our automatic approach
        1. Table 1
    6. 5. Discussion
      1. 5.1. Coverage
      2. 5.2. Time lag
      3. 5.3. Practical implications of this research
      4. 5.4. Limitations of our approach and future work
    7. 6. Conclusions
    8. Acknowledgments
    9. Abbreviations
    10. Footnotes
    11. References
      1. 1
      2. 2
      3. 3
      4. 4
      5. 5
      6. 6
      7. 7
      8. 8
      9. 9
      10. 10
      11. 11
      12. 12
      13. 13
      14. 14
      15. 15
      16. 16
      17. 17
      18. 18
      19. 19
      20. 20
      21. 21
      22. 22
      23. 23
      24. 24
      25. 25
      26. 26
      27. 27
      28. 28
      29. 29
      30. 30
      31. 31
      32. 32
      33. 33
      34. 34
      35. 35
      36. 36
      37. 37
      38. 38
      39. 39
      40. 40
      41. 41
      42. 42
      43. 43
      44. 44
      45. 45
  9. Euretos
    1. Fact Sheet
      1. Late stage lead attrition in biotech & pharma R&D
      2. BRAIN[Ξ] – addressing lead attrition
      3. BRAIN[Ξ] – The Euretos ‘Bio Relations and Intelligence Network’
      4. Get answers ‘now’!
      5. Unique value
      6. Benefits
      7. Contact
      8. Appendix 1 – Selection of relevant publications
    2. BRAIN Sample Output
    3. BRAIN
    4. Euretos
      1. Albert Mons - CEO
      2. Marco Wanders - Head of Sales
      3. Onno Becker Hof - CTO
      4. Aram Krol - Head of Product Development
      5. Arie Baak - Head of Market Development
    5. Contact
    6. Login
  10. Biosemantics
  11. Immune activation and collateral damage in AIDS pathogenesis
    1. CHRONIC IMMUNE ACTIVATION IS THE PRIMARY DRIVER IN HIV PATHOGENESIS
      1. Box 1 Damage control in non-pathogenic SIV infection
    2. CAUSES OF IMMUNE ACTIVATION IN HIV INFECTION
      1. BREACH OF GASTRO-INTESTINAL IMMUNITY
      2. SINGLE-STRANDED RNA, TOLL-LIKE RECEPTORS, AND TYPE I IFN PRODUCTION
        1. Figure 1 Pathways of chronic immune activation and its down-stream effects in HIV infection
    3. PATHOGENIC EFFECTS OF IMMUNE ACTIVATION AND INFLAMMATION
      1. INFLAMMATION DRIVES CD4+ T-CELL DEPLETION AND LOSS OF HIV-SPECIFIC IMMUNITY
      2. HIV-INDUCED INFLAMMATION AND HIV-ASSOCIATED NON-AIDS DISEASE
    4. HIV IN COMPARISON TO OTHER PERSISTENT VIRAL INFECTIONS
    5. THE IMMUNE ACTIVATION HYPOTHESIS REDUCED TO PRACTICE
      1. BOOSTING IMMUNITY
      2. THERAPEUTIC DAMAGE CONTROL
    6. CONCLUSION
    7. ACKNOWLEDGMENTS
    8. REFERENCES
      1. 1
      2. 2
      3. 3
      4. 4
      5. 5
      6. 6
      7. 7
      8. 8
      9. 9
      10. 10
      11. 11
      12. 12
      13. 13
      14. 14
      15. 15
      16. 16
      17. 17
      18. 18
      19. 19
      20. 20
      21. 21
      22. 22
      23. 23
      24. 24
      25. 25
      26. 26
      27. 27
      28. 28
      29. 29
      30. 30
      31. 31
      32. 32
      33. 33
      34. 34
      35. 35
      36. 36
      37. 37
      38. 38
      39. 39
      40. 40
      41. 41
      42. 42
      43. 43
      44. 44
      45. 45
      46. 46
      47. 47
      48. 48
      49. 49
      50. 50
      51. 51
      52. 52
      53. 53
      54. 54
      55. 55
      56. 56
      57. 57
      58. 58
      59. 59
      60. 60
      61. 61
      62. 62
      63. 63
      64. 64
      65. 65
      66. 66
      67. 67
      68. 68
      69. 69
      70. 70
      71. 71
      72. 72
      73. 73
      74. 74
      75. 75
      76. 76
      77. 77
      78. 78
      79. 79
      80. 80
      81. 81
      82. 82
      83. 83
      84. 84
      85. 85
      86. 86
      87. 87
      88. 88
      89. 89
      90. 90
      91. 91
      92. 92
      93. 93
      94. 94
      95. 95
      96. 96
      97. 97
      98. 98
      99. 99
      100. 100
      101. 101
      102. 102
      103. 103
      104. 104
      105. 105
      106. 106
      107. 107
      108. 108
      109. 109
      110. 110
      111. 111
      112. 112
      113. 113
      114. 114
      115. 115
      116. 116
      117. 117
      118. 118
      119. 119
      120. 120
      121. 121
      122. 122
      123. 123
      124. 124
      125. 125
      126. 126
      127. 127
      128. 128
      129. 129
      130. 130
      131. 131
      132. 132
      133. 133
      134. 134
      135. 135
      136. 136
      137. 137
      138. 138
      139. 139
      140. 140
      141. 141
      142. 142
      143. 143
      144. 144
      145. 145
      146. 146
      147. 147
      148. 148
      149. 149
      150. 150
      151. 151
      152. 152
      153. 153
      154. 154
      155. 155
      156. 156
      157. 157
      158. 158
      159. 159
      160. 160
      161. 161
      162. 162
      163. 163
      164. 164
      165. 165
      166. 166
      167. 167
      168. 168
      169. 169
      170. 170
      171. 171
      172. 172
      173. 173
      174. 174
      175. 175
      176. 176
      177. 177
      178. 178
      179. 179
      180. 180
      181. 181
      182. 182
      183. 183
      184. 184
      185. 185
      186. 186
      187. 187
      188. 188
      189. 189
      190. 190
      191. 191
      192. 192
      193. 193
      194. 194
      195. 195
      196. 196
      197. 197
      198. 198
      199. 199
      200. 200
      201. 201
      202. 202
      203. 203
      204. 204
      205. 205
      206. 206
      207. 207
      208. 208
      209. 209
      210. 210
      211. 211
      212. 212
      213. 213
      214. 214
      215. 215
      216. 216
      217. 217
      218. 218
      219. 219
      220. 220
      221. 221
    9. Other
  12. NEXT

Euretos BRAIN



Story

Joint NSF-NIH Biomedical Big Data Research

Helping you publish discover and reuse research data.png

Fairport Convention on Blackboard.jpg

FAIRPORT keywords.png

Workshop Data FAIRPORT, Leiden 16-01-2014.png

The "New Science of the Brain" (National Geographic, February 2014) reminded me that Dr. Barend Mons had mentioned his BRAIN work, which builds on Semantic Medline, a resource our Semantic Data Science Team has been working on for the White House Big Data Initiative. I asked whether he could point me to more information on this work. Previously I had viewed the 2010 VIVO Conference video.

His response was: Please find some more info here: http://www.euretos.com/brain

I am (independently) advising a company that builds a lot on published text mining tools, ontologies, data models and reasoning algorithms developed in our LUMC and EMC groups.

Actually, BRAIN (i.e. EURETOS) has meanwhile incorporated significantly more data sources than just Semantic Medline (among others UniProt and ChEMBL). In fact they also did a lot of work to turn any redundancy in triples in Semantic Medline into what we call ‘cardinal assertions’ with the underpinning ‘nanopublications’ (triple-assertions with their provenance). This speeds up the reasoning process quite spectacularly. We were just awarded a substantial grant with 7 academic partners and 12 companies (among which Nature Genetics, YARCData and Elsevier) to take this approach to the next level. More information can be obtained from Albert Mons or Arie Baak.

BarendMons10032012Slide24.PNG

To give you one flavour: I created in 4 minutes (!) the picture below in BRAIN, which quite nicely ‘reproduces’ man-months of reading 221 papers revealing the functional links between HIV disease progression and inflammation. For each cardinal assertion in this network, all underlying nanopublications (and the sentences in PubMed behind them) are one click away. Note, this was before the paper was published (paper attached). If one pulls the two core concepts apart, the system tells you ‘no way, there must be a relationship’.

BRAINSampleOutput.png

I understood via Nigam Shah that NLM (see example) is moving in the ‘prediction’ direction as well. 

We have some nice papers in the pipeline showing systematic ‘implicit’ predictions, and much more to come.

I used the Semantic Medline part of our BRAIN tool to reproduce my earlier finding that Tegafur may work for malaria. Briefly, BRAIN is in principle a graph with unique (cardinal) assertions, each supported by minimally one record in PubMed, UniProt or ChEMBL (soon more).

One ‘cardinal assertion’ that came up as part of the picture was [antineoplastic agents] [treat] [malaria, falciparum].

I linked out to the PMIDs supposedly containing the supporting nanopublications, and apparently (unless the BRAIN guys made a mistake) they came from PMID 9300459 and PMID 15565927.

In both, I cannot find (neither in the text nor in the MeSH terms) a reference to the exact concept antineoplastic agents. So my question is: do you also include triples that are ‘inferred’ (as the triple makes sense to me) or only exact findings in the text? In the latter case: how can we explain this link? In the lower panels you see the triples that (according to the database) were found by your system in those two PMIDs.

BRAIN1.png

END OF RESPONSE
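
The idea in the response above, collapsing redundant triples into unique ‘cardinal assertions’ that keep pointers to their supporting nanopublications, can be sketched in a few lines of Python. This is only my illustration of the grouping step, not how BRAIN or Semantic Medline actually implements it; the `Nanopublication` class, the `cardinal_assertions` function, and the sample triples and source labels are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Nanopublication(NamedTuple):
    """One triple-assertion plus its provenance (here just a source label)."""
    subject: str
    predicate: str
    obj: str
    source: str  # e.g. a PMID; the values used below are placeholders


def cardinal_assertions(nanopubs):
    """Collapse redundant triples into unique assertions, keeping provenance."""
    grouped = defaultdict(set)
    for nanopub in nanopubs:
        triple = (nanopub.subject, nanopub.predicate, nanopub.obj)
        grouped[triple].add(nanopub.source)
    # Sort the sources so the result is deterministic.
    return {triple: sorted(sources) for triple, sources in grouped.items()}


# Hypothetical sample data: the same triple stated three times in two sources.
nanopubs = [
    Nanopublication("drug A", "treats", "disease X", "source-1"),
    Nanopublication("drug A", "treats", "disease X", "source-2"),
    Nanopublication("drug A", "treats", "disease X", "source-1"),
    Nanopublication("gene B", "associated with", "disease X", "source-3"),
]

assertions = cardinal_assertions(nanopubs)
for triple, sources in assertions.items():
    print(triple, "supported by", sources)
```

Reasoning then only has to walk two unique assertions instead of four raw triples, while every assertion can still be traced back to its sources, which is the kind of check I was attempting with the two PMIDs above.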

On 13 - 16 January (2014), the Dutch ELIXIR Node, in cooperation with the Netherlands eScience Center and the Lorentz Center, organized an ‘unconference’ entitled: ‘Jointly designing a Data FAIRPORT’. During these four days key aspects of a global infrastructure for effective data publishing, discovery, sharing and re-use for eScience experimentation were discussed. The workshop brought together 25 high level participants representing leading research infrastructures and policy institutes, publishers, semantic web specialists, innovators, computer scientists and experimental (e)Scientists. Further information is available from the DTL website here. Jointly designing a DATA FAIRPORT is shown below (incomplete because the report has not been publicly released). See my notes (Research Notes).

The Executive Summary concludes: Although not part of the core DATA FAIRPORT convention, the need for minimally one reference implementation has been recognized as essential to achieve a wide adoption, and a functional demonstrator should emerge in the coming months from voluntary contributions of the attendees.

So I decided to try to implement the concepts and sample content provided above by first "mashing up" all the information below as follows:

Interestingly, the AIDS paper says: Search strategy and selection criteria: references for this article were identified through searches of PubMed for articles published from 1985, by use of the terms HIV, SIV, AIDS, immune activation, immunity, pathogenesis. Articles resulting from these searches and relevant references cited in those articles were reviewed. Articles published in English were included.

The last two are publications that have been rendered in this MindTouch wiki to make them easily accessible and reusable in a Knowledge Base.

I looked for some sample data at the main Dutch Techcentre for Life Sciences (DTL) web site and its MediaWiki but was unable to find any. I inventoried the Data FAIRPORT Dropbox files in a spreadsheet that also contained the Knowledge Base of this MindTouch page. See below. These are in both relational and graph (RDF triple: subject, predicate, object) data set formats.

BRAINSpreadsheetKnowledgeBase.png

BRAINSpreadsheetDataFairportConference.png

My goal is to visualize these data sets using tools like Spotfire Network Analytics and NodeXL. My Spotfire Cloud Library contains about 300 files, each containing one or more data sets that I share openly. A simple Data Fairport node could be constructed by each research team putting all their content in a wiki with well-defined (and hopefully permanent) URLs, putting that data in a spreadsheet or a relational or graph database, and publishing it to a cloud library with public access for reuse.
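
As a rough illustration of the step from a spreadsheet (relational) to a graph (triple) form, here is a minimal Python sketch. It is not the process I used in Spotfire; the base URL, the column names and the `rows_to_triples` function are assumptions made up for the example, and the predicates are plain column names rather than terms from a real vocabulary.

```python
import csv
import io

# Hypothetical base URL for a wiki with stable page addresses.
BASE = "http://example.org/wiki/"

# Stand-in for a spreadsheet exported as CSV; the columns are illustrative.
SHEET = """page,title,type
Story,Story,Section
Executive_summary,Executive summary,Section
"""


def rows_to_triples(csv_text, key="page"):
    """Turn each row into (subject, predicate, object) triples.

    The key column becomes the subject URL; every other non-empty
    column becomes one predicate/object pair.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + row[key]
        for column, value in row.items():
            if column != key and value:
                triples.append((subject, column, value))
    return triples


triples = rows_to_triples(SHEET)
for s, p, o in triples:
    print(f'<{s}> {p} "{o}"')
```

Because the subject is the wiki URL, the same row can be read either as a spreadsheet record or as a set of graph edges, which is why well-defined and permanent URLs matter for this kind of node.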

This effort might be viewed as a Skunkport: a rapid prototype of a revolutionary, competitive infrastructure for preserving, discovering and reusing research data and tools. Such a prototype identifies a use case, develops an early functional spec, wires existing resources together, and answers questions from the community. It answers the following questions:

  • What are you doing? A "Skunkport"
  • Why are you doing it? The March 4th Meetup
  • What will happen? What will people experience? I will report on that after the March 4th Meetup!
  • How will it help? What will be the benefits? A simple example of how Joint Biomedical Big Data Research could be captured, published, and preserved.

I just added Data Science for VIVO as another "Skunkport" example.

MORE TO FOLLOW AFTER THE MARCH 4TH MEETUP

A recent meeting of the President’s Council of Advisors on Science and Technology (PCAST) discussed "Scientific reproducibility and big data" (see Research Notes) which requires the kind of structured scientific data publishing described and illustrated here for another scientist to come along and essentially audit previous scientific work. Dr. George Strawn's excellent slides for the Data Fairport Unconference mentioned that US Federal agencies have submitted their "initial plans" for public access to scientific data to the White House Office of Science and Technology Policy (OSTP). See An Open Data Policy

The recent National Geographic Secrets of the Brain says:

  • New technologies are shedding light on biology's greatest unsolved mystery: how the brain really works
  • Mind Machine: An engineer wears a helmet of sensors at the Martinos Center for Biomedical Imaging, part of a brain scanner requiring as much power as a nuclear submarine.
  • The Color of Thought: The brain's many regions are connected by some 100,000 miles of fibers called white matter, enough to circle the Earth four times.
  • Anatomy of a Mystery: a high-resolution image reveals white matter fibers arranged in a mysterious grid structure like longitude and latitude lines on a map.
  • The Glow of Memory: "When you form a memory, there's a physical change in the brain," says Don Arnold, of the University of Southern California.
  • Browsing down to single nerve cells may finally provide answers to basic questions about the brain.
  • Jennifer on the Brain: Caltech and UCLA scientists use pictures to study how the brain processes what the eyes see. In 2005 they found an individual nerve cell that fired only when subjects were shown pictures of Jennifer Aniston. Follow-up studies suggest that relatively few neurons are involved in representing any given person, place, or concept, making the brain staggeringly efficient at storing information.
  • The secret to many diseases may be hiding in the brain's genes, as they shut down or switch on abnormally.
  • Intimate View: A stack of 10,000 photomicrographs of a mouse brain form a 3-D model no larger than a grain of sand. A human brain visualized at this level of detail would require an amount of data equal to all the written materials in all the libraries of the world.
  • Deep Brain Dive: For the first time scientists can visualize how neurons actually connect with one another.
  • Half the World's Hard Drives: Visualizing neurons at their level of activity requires unprecedented computing power. The storage capacity needed to produce mouse brain images: 450,000 TB; human brain image: 1.3 billion TB; and global digital storage: 2.7 billion TB.
  • To see the brain, scientists at Stanford University begin by making it as transparent as a glass marble. If their models are accurate, the researchers will be able to literally read the mind of a mouse.

Albert Einstein said: If you can’t explain it simply, you don’t understand it well enough.

Interview Innovation International

Source: http://datafairport.org/images/docum...%20Article.pdf (PDF)

Data Fairport: enabling global exchange of research data

On 13-16 January 2014 a varied group of stakeholders from the global life sciences data field met in Leiden, The Netherlands, to discuss how they could create an environment that enables effective sharing and re-use of biological research data. International Innovation spoke with the initiators and some of the attendees to discuss how public and private stakeholders and international communities can work together to enable secure and effective data ‘interoperability’.

Key Presenters

Dr Ruben Kok

RubenKok.png

Director of the Dutch Techcentre for Life Sciences, Director of the Netherlands Bioinformatics Centre

“Today’s data-intensive life science research poses common challenges to biologists across sectors to integrate, analyse, securely share and manage their research data. We need a global environment that enables biologists and solution providers to make data sharing a common practice.”

Dr George Strawn

GeorgeStrawn.png

Director of the Federal Networking and Information Technology Research and Development

“I am an observer from the US federal government and especially interested in this conference given the recent requirement to provide open access to scientific results funded by the US federal government covering both scientific articles as well as the supporting data.”

Professor Dr Barend Mons

BarendMons.png

Professor of Biosemantics at Leiden University Medical Centre, Head of the Dutch ELIXIR Node and an Integrator for Life Sciences at the Netherlands eScience Center

“I’m one of the co-organisers of the workshop and my main interest is modern big data-driven science, knowledge discovery with the computer based on experimental data and narrative information.”

“Our ability to generate enormous amounts of data has grown much faster than the capacity to store, link and analyse them.”

Life science, the study of living organisms, is built on a tradition of cataloguing biological facts. As such, biology is inherently data-intensive, but the digitalisation of information and increased compute and storage capacity of computers and the speed of data transport across networks has created a new age of data-driven and computational opportunities for the life sciences. From the molecular level through to the organism and population as a whole, data capture covers every complex interaction and builds a picture of mechanisms of disease and drivers of behaviour to a resolution never previously imagined. This increases data variety and complexity as much as it drives up data volumes, and this presents many challenges: how can data be successfully integrated, analysed, securely shared and managed among scientists across many different institutions and sectors? These challenges are social as much as they are technological. What is ultimately required is a global and sustainable data sharing environment that makes it easy to publish research data, discover data sets and re-use them. Above all, this requires the global adoption of a series of standards that make data and software talk to each other: they must be ‘interoperable’.

Dr Niklas Blomberg

NiklasBlomberg.png

Director of ELIXIR, the European Infrastructure for Life Science Data

“My role is to participate both as a representative of the ELIXIR infrastructure to help to picture this in the overall landscape as well as understand these community efforts and develop standards going forward.”

“Interoperability, that’s really the key. And the other one is longevity – how do we sustain the data going forward?”

Dr René van Schaik

RenevanSchaik.png

CEO of the Netherlands eScience Center

“We have co-funded the Data Fairport conference as this is in our view a great contribution to stimulating eScience in The Netherlands, especially when it comes to data stewardship and software sustainability.”

“Once we have this system in place, it opens up the possibility for all sorts of services, but we need to start with a common vision, a common ground and a common set of rules to work from.”

Could you briefly explain the main topic of the Data Fairport conference?

BM: The backdrop of the conference is the immensity and complexity of contemporary biological data, and the requirement for data to be understandable to both humans and computers. This poses a daunting challenge where close collaboration between various stakeholders in the life sciences is essential.

RS: The amount of data that are being generated now and the new techniques in biology that are being used are generating a tsunami of very diverse data. More and more funders are demanding sound data stewardship plans for every grant awarded through their funding system, because they realise that they create a lot of value with these data.

GS: At the highest level what we’re looking to establish is the interoperability of heterogeneous data sets as we can’t expect the data collected by thousands of investigators to be in a similar format.

Why do you think this issue of ‘big data’ in the life sciences has become so prominent of late?

GS: It’s only relatively recently that the disk storage has been large and cheap enough; that computers have been fast enough; and that the networks have had wide enough bandwidth that we could seriously think about storing most things. Now that we can do all this, we see that there are great advantages if we can develop the software to support the hardware and improve data mining into this tremendous source of scientific data.

NB: It is also expensive and a long-term commitment, so you do think twice before you embark on this journey.

BM: Our ability to generate enormous amounts of data has grown much faster than the capacity to store, link and analyse them. Only now are we catching up and creating the data-handling capacity that should have been developed at the same rate as the capacity to generate this data.

What do you see as the main challenges ahead? Is it the technology or are other aspects just as important?

BM: One of the conclusions that we can take from the conference is that the technology needed to make this happen is essentially 99 per cent there. However, to start an initiative that is also endorsed by major funders, the social hurdles that must be overcome are equally important. This includes the challenge to align people – now that everyone realises the importance of data, the danger is that we will get 500 different initiatives to solve the same problem.

RK: With this conference we have taken an important step forward as we have quite a representative group here from many different disciplines and stakeholders. If this group gets behind this initiative, people will take it as a very serious attempt.

NB: I think that at an individual level, most biologists take great care in preserving data both for their own purposes and to make sure publications are well-founded and reproducible. What they require is guidance and support in how to do this with larger datasets and how to follow the new data stewardship requirements from funding organisations.

Making data accessible in itself isn’t hard: you just take a hard drive and hook it up to the internet. Making data accessible so that other people can find and use it is difficult. There are a lot of ongoing community initiatives, but there are no widely accepted guidelines for how to do this on a European, let alone global, scale. Researchers are simply looking for support in how to do the right thing.

RK: One of the things we concluded at the conference is that we need a better rewarding mechanism for entering data into public data systems – this is one of the social elements that we need to try and address.

BM: One important ‘non-technical’ hurdle is that a new profession – the data scientist – is emerging as a key requirement. There is no way that all biologists or any other researchers can become experts in handling data. The Economist predicts a shortage of about 190,000 people globally in 2017-18 that know how to deal with data. However, trusting your data to an outsider is not easy, and there is no structure in universities at the moment to train data specialists and give them permanent positions.

Can you each give your personal vision for the Data Fairport in the long run?

GS: At the highest level, I am hoping that we will develop the technology and the social willingness to work on interoperability of heterogeneous datasets so that we can combine them in novel ways. If we can truly structure scientific data, we will be able to conduct new science.

NB: Interoperability, that’s really the key. The other one is longevity – how do we sustain the data going forward? To achieve this, there needs to be agreement on key standards – I would like to see how all of these community efforts are coming together and how we can define a process and a strategy for interoperability.

RS: Once we have all that in place, it opens up the possibility for all sorts of services, but we need to start with a common vision, a common ground and a common set of rules to work from.

BM: With Niklas and George we have representatives from European and US authorities attending. Also at the conference is Professor Abel Packer, who is running Scientific Electronic Library Online (SciELO) for 16 countries in South America. Currently, South Africa and China are also in the process of building up SciELO instances. I think linking SciELO to the Data Fairport backbone would create a great opportunity for millions of bright minds in developing countries to make a career and get into the scientific mainstream.

GS: I would just add optimistically that science already has a community of sharing via research articles, so all we have to do is extend that concept from just articles to articles and datasets. It will be very important for universities and other funders to expand the concept of faculty rewards to include rewards for publishing data, just as now faculties are rewarded for publishing their research articles. This could also be extended to include software.

What do you hope to have achieved at the end of this week, coming out of this conference?

BM: We are quite modest in our short-term objectives. We have another meeting planned in The Netherlands in September, coinciding with a plenary session of the so-called Research Data Alliance. The first step is to form a steering committee to reach consensus about the minimal requirements and to develop a really solid plan to be presented at that meeting. In parallel we want to raise some initial funds and build some prototypic implementations; so by the end of this year we are ready to start building this thing globally.

NB: There are many initiatives already ongoing and there is a need to find a way to bring the community together and represent the needs of ordinary researchers as well as the longer-term aims of the funding organisations.

GS: If history is any guide, we’ve seen some community activities with similar aspirations work in the past. In the 1980s-90s for example, a group called the Internet Engineering Task Force arose out of the original foundations of the internet to make community decisions on internet standards and protocols. Then, in the 90s and the 2000s, the World Wide Web consortium arose to do the same thing for standards and protocols associated with the Web. Both of these activities are what you would call non-profit community-orientated activities; but they have produced key platforms upon which other entrepreneurs have been able to found very important businesses in service to science and society.

RK: Most of the technological solutions and standards are floating around already. We don’t want to re-invent wheels here. So, in the next nine months a major role for our group will be to visit the major players and invite them to participate in a comprehensive and coherent approach to foster data sharing and re-use.

The Data Fairport Conference

13-16 January 2014

25 experts from the worlds of research infrastructure and policy, publishing, the semantic web and more were brought together for four days to discuss how best to deal with life science data and proceed with the Data Fairport. Here, International Innovation outlines their conclusions:

AIM

The Data Fairport aims to provide a minimal (yet comprehensive) framework in which current issues in data discoverability, access, annotation and authoring can be addressed. The Data Fairport will not dictate a single platform or a tightly integrated data infrastructure, but will instead focus on conventions that enable data interoperability, stewardship and compliance against data and metadata standards, policies and practices.

APPROACH

It was proposed that the convention for data and model services interoperability should be based on the minimal ‘hourglass’ type approach, which defines the absolute minimum that is required to achieve interoperability. This is similar to the approach that underpins the internet, the web and other robust, heterogeneous yet interoperable infrastructures. We shall therefore focus on the specification of lightweight interfaces, standard protocols and standard formats, that are founded (where possible) on existing community standards.

SCOPE

The Data Fairport is not about the development of additional standards, but rather:

  • Adoption of standards
  • Communication of standards
  • Simplification of standard interoperation
  • Adoption of cross-cutting standards for provenance, versioning, identity and dependency for data and for metadata covering identifiers, formats, checklists and vocabularies
  • Interoperation of data services
  • Reconciliation of evolving standards and the datasets organised or annotated by them
  • Minimal models of investigation for grouping results
  • Metadata required to link data with analytics (notably models)
  • Data citation – mechanics, adoption, recognition

KEY DRIVERS

Understanding complex biological systems remains a great challenge for the life sciences. Perhaps the most immediate challenge is the human body, which has between 20,000 and 25,000 genes located on 46 chromosomes, whose expression is modulated by large amounts of additional genetic elements that constitute multiple layers of regulatory control. The emergence of high-throughput ‘omics’ research to examine these components and correlate them to health and disease has led data production to increase exponentially. The increased potential to sequence many genes, together with the rapidly decreasing cost and time of sequencing, has led to an overwhelming deluge of data production in medical science. In just 10 years the price of human genome sequencing has diminished from €4 billion (the cost of the Human Genome Project) to around €1,000 thanks to advances in sequencing technologies.

Such complex biological systems and the enormous volumes of data being generated impact greatly on the life science community, including biomedical researchers and clinicians, but also those working for scientific communication outlets, such as publishers. The PubMed database, for instance, has received 20 million biomedical research articles to date – which amounts to one new submission every 40 seconds. This surge of data could lead to inaccuracies in research; new and potentially important data slipping through the net; and expensive research projects needing replication. Proper use of these data, however, has the potential to generate an array of exciting new discoveries. If scientists are to utilise all existing and incoming life sciences data, a shift from the ‘old (data poor) paradigm’ to a ‘new (data intensive) paradigm’ will be required. This will potentially require a total shift in the way science is performed and scientific success is measured.

“Most of the technological solutions and standards are floating around already. We don’t want to re-invent wheels here”.
– Dr Ruben Kok

Is this concept unique to life sciences? Do you think it is a model that could be rolled out for other disciplines?

RS: You can definitely apply it to other areas of science. For instance, in astronomy, the datasets are even bigger, but they are also simpler, and the same is true for nuclear physics. There is a lot of noise in the data and they throw a lot away just to get at the interesting parts. Life science is special because of the enormous variety of data that we have to deal with. When we get to the medical domain and start dealing with patient data, there is the added complication of privacy as well.

NB: Social sciences and life sciences have a lot of things in common in the health domain, as they also deal with highly sensitive personal data, so ethical and privacy issues are very similar. Maybe it should be added here that we are talking about life sciences as if they are one discipline, but they are still quite a heterogeneous group, and multidisciplinary in their own right.

GS: I would just add that not only are these technologies ultimately applicable to all science and other scholarly domains, their ultimate value will hopefully be to promote interdisciplinary research. Overlaps between chemistry and biology are well known; and between biology and geology now as climate change is considered – if we can use electronic technology to help us articulate between and among these scientific fields, I think we will create entire new tiers of knowledge.

BM: If you create a contingent of data experts with no specific disciplinary bias, those will be the living connectors between areas in a way – by implementing the same approaches throughout disciplines. So I think in the end what we are discussing here has implications not just for the life sciences but for the wider scientific community as well.

INTELLIGENCE

THE DATA FAIRPORT OBJECTIVES

The Data Fairport initiative focuses on agreeing conventions that enable data interoperability, stewardship and compliance against data and metadata standards, policies and practices. It does not dictate a single platform, nor a tightly integrated data infrastructure, but focuses on the specification of lightweight interfaces, standard protocols and standard formats to define a set of minimal requirements and combining existing community standards as much as possible.

CONFERENCE PARTICIPANTS

Dr Myles Axton, Nature Genetics

Drs Arie Baak; Drs Albert Mons, Phortos Consultants, Euretos

Dr Jan Willem Boiten, Dutch Techcentre for Life Sciences (DTL), CTMM-TRaIT

Professor Barend Mons, Leiden University Medical Center, DTL, ELIXIR NL

Dr Niklas Blomberg, ELIXIR Hub

Olivier Dumon; Dr IJsbrand Jan Aalbersberg; Gaby Appleton, Elsevier

Professor Carole Goble, University of Manchester, ELIXIR UK

Professor Jaap Heringa, VU University Amsterdam, DTL, ELIXIR NL

Dr Bengt Persson, BILS, ELIXIR Sweden

Dr Thierry Sengstag, SIB Swiss Institute of Bioinformatics, ELIXIR-CH

Dr Maurice Bouwhuis, SURFsara

Professor Anthony Brookes, University of Leicester, Gen2PHEN/GWAScentral

Professor Tim Clark, Harvard Medical School, Mass. General Hospital, Force11

Dr Michel Dumontier, Stanford University, NCBO, Bio2RDF

Professor Frank van Harmelen; Dr Paul Groth, VU University Amsterdam, W3C, Open PHACTS

Dr Rob Hooft, DTL, Netherlands eScience Centre

Professor Joost Kok, Leiden University

Dr Ruben Kok, DTL, Netherlands Bioinformatics Centre (NBIC)

Professor Johan van der Lei, Erasmus Medical Center, EMIF

Dr Rene van Schaik; Dr Scott Lusher, Netherlands eScience Center

Dr Erik van Mulligen, Erasmus Medical Centre, S&T

Professor Abel L Packer, SciELO, Brazil

Dr Ted Slater, YarcDATA

Dr George Strawn, National Coordination Office/NITRD (USA)

Dr Morris Swertz, Groningen University Medical Centre, DTL, BBMRI-NL

Drs Jan Velterop, Acknowledge

Dr Mark Wilkinson, University of Madrid, SADI

CONTACT

Dr Barend Mons
Dutch Techcentre for Life Sciences (DTL)
E barend.mons@dtls.nl
http://www.dtls.nl

WWW.RESEARCHMEDIA.EU 101

Jointly designing a DATA FAIRPORT

Source: http://datafairport.org/images/docum...rt%20final.pdf (PDF)

Source: http://www.dtls.nl/dtl/news/fairport-workshop.html

On the 13th to 16th of January 2014 the Dutch ELIXIR Node, in cooperation with the Netherlands eScience Center and the Lorentz Center, organized an ‘unconference’ titled: ‘Jointly designing a Data FAIRPORT’. During these four days key aspects of a global infrastructure for effective data publishing, discovery, sharing and re-use for eScience experimentation were discussed. The workshop brought together 25 high level participants representing leading research infrastructures and policy institutes, publishers, semantic web specialists, innovators, computer scientists and experimental (e)Scientists.

Through a mix of moderated plenary sessions and break out groups the context, user needs, technological challenges, access & business models, funding requirements and governance of such a global initiative were discussed. The outcome of the meeting was a remarkable consensus about the best way forward. All participants agreed that a global infrastructure for professional data publishing, discovery, exchange and re-use is essential for effective data driven research, as well as addressing the data stewardship requirements that are increasingly demanded by science funders. 

The principal outcome of the meeting is that a backbone will be designed that will enable global interoperability of data. The key requirement for this backbone will be to allow, as a very minimum, that computers can ‘independently’ discover all available data sets for a specific research requirement. At the core of this ‘DATA FAIRPORT ecosystem’ will be a minimal protocol that will define basic semantic interoperability of datasets, where possible using already endorsed or emerging community standards and protocols.

Standards and protocols at the core of the DATA FAIRPORT would be endorsed by ELIXIR and sister initiatives around the world (such as NCO in the USA and SCIELO in Latin America) and DATA FAIRPORT compliant services can also seek endorsements from these authorities. The DATA FAIRPORT will also answer part of the needs expressed in the Interoperability Programme of Work of ELIXIR and the ‘Principles of data management and sharing at European Research Infrastructures’ under preparation in BioMedBridges. 

The meeting was financially and logistically supported by the Netherlands eScience Center, a partner in the Netherlands ELIXIR Node DTL, and moderated by Phortos Consultants.  

For more information please contact: FAIRPORT@dtls.nl

Executive summary

This document recapitulates the discussions which took place during the first FAIRPORT unconference on 13-16 January 2014. The meeting aimed at defining the DATA FAIRPORT, a minimal (yet comprehensive) framework in which current issues in data discoverability, access, annotation and authoring can be addressed.

The DATA FAIRPORT will not dictate a single platform or a tightly integrated data infrastructure. Rather it will focus on conventions that enable data interoperability, stewardship and compliance against data and metadata standards, policies and practices. It was proposed that the convention for data and model services interoperability should be based on the minimal “hourglass” approach, which is the same as the approach that underpins the internet, the web and other robust, heterogeneous yet interoperable infrastructures. The hourglass focuses on the specification of lightweight interfaces, standard protocols and standard formats to define a ‘minimal Data Fairport scope’. It was proposed that the conventions for data and model metadata descriptions be founded on community standards for identifiers, formats, checklists and vocabularies.

The DATA FAIRPORT is not about the development of yet more standards; it is about:

  • the adoption of standards
  • the communication of standards
  • the simplification of standard interoperation
  • the adoption of cross-cutting standards for provenance, versioning, identity and dependency for data and for metadata
  • the interoperation of data services (is that another workstream?)
  • the reconciliation of evolving standards and the datasets organised or annotated by them
  • the minimal models of investigation for grouping results
  • the metadata required to link data with analytics (notably models)
  • data citation: mechanics, adoption, recognition

Although it is not part of the core DATA FAIRPORT convention, at least one reference implementation was recognized as essential for wide adoption, and a functional demonstrator should emerge in the coming months from voluntary contributions of the attendees.

List of unconference Participants

Source: http://www.dtls.nl/dtl/about-us/participants-list

  • Dr. Myles Axton - Nature Genetics
  • Dr. Niklas Blomberg - Director of ELIXIR Hub (UK)      
  • Dr. Maurice Bouwhuis - SURFsara
  • Prof. Anthony Brookes - Uni Leicester, GEN2PHEN-Alliance & BioShaRE (GWAS-Central, Cafe Variome, OmicsConnect)
  • Dr. Tim Clark - Harvard Medical School and FORCE11
  • Olivier Dumon - Elsevier (managing director), Netherlands
  • Prof. Michel Dumontier - Stanford University, USA
  • Prof. Carole Goble - Deputy head of UK ELIXIR node, UNIMAN
  • Dr. Paul Groth - VU University Amsterdam and Open PHACTS
  • Prof. Frank van Harmelen - VU University Amsterdam and Open PHACTS
  • Prof. Jaap Heringa - Deputy Head of ELIXIR Node NL
  • Prof. Joost Kok - Prof. of Computational Science, Leiden University
  • Dr. Ruben Kok - Director of Dutch Techcentre for Life Sciences (DTL)
  • Prof. Johan van der Lei - Erasmus Medical Center & coordinator of EMIF
  • Prof. Barend Mons - Convener, Head of ELIXIR Node NL
  • Abel L. Packer - SciELO, Brazil, Latin America
  • Dr. Bengt Persson - Head of ELIXIR node Sweden / BILS
  • Dr. Thierry Sengstag - SIB (representing Prof. Ron Appel)
  • Dr. Ted Slater - YarcDATA and semantic web specialist USA
  • Dr. George Strawn - Director of NCO/NITRD (USA)
  • Drs. Jan Velterop - Acknowledge and OA specialist, UK
  • Dr. Mark Wilkinson - Semantic tool specialist, Spain and SADI

Moderators:

  • Drs. Mr. Arie Baak, B.A. - Netherlands eScience Center and Phortos Consultants
  • Dr. Jan Willem Boiten - Center for Translational Molecular Medicine (CTMM) and Dutch Techcentre for Life Sciences (DTL)
  • Dr. Rob Hooft - Dutch Techcentre for Life Sciences (DTL)
  • Dr. Scott Lusher - Netherlands eScience Center
  • Drs. Albert Mons - Phortos Consultants and Euretos
  • Dr. Erik van Mulligen - Erasmus Medical Centre and S&T (general rapporteur)
  • Dr. Rene van Schaik - CEO ad interim, Netherlands eScience Center
  • Dr. Morris Swertz - Dutch Techcentre for Life Sciences (DTL)

Invited but unable to attend & in some cases represented or attending only a part:

  • Dr. IJsbrand Jan Aalbersberg - Elsevier Senior Vice President
  • Prof. Ron Appel - Director SIB & HoN ELIXIR, Switzerland (rep: Thierry Sengstag)
  • Gaby Appleton - Elsevier managing director strategy
  • Prof. Phil Bourne - Associate Director for Data Science, NIH
  • Prof. Mark Musen - NCBO/Stanford, USA (rep: Michel Dumontier)
  • Dr. Ir. Anwar Osseyran - Director of SURFsara (rep: Maurice Bouwhuis)
  • John Wilbanks - Consultant NITRD and Sage Bionetworks
  • Dr. Bryn W-Jones - Coordinator Open PHACTS and consultant

0 Introduction

This is the conference report of the ‘Jointly Designing a DATA FAIRPORT’ conference that was held on 13-16 January 2014 in Leiden, The Netherlands. The pivotal challenge addressed in this setting is related to the rapid shift to data-driven (e)Science, triggered by the ability to create ever larger and more complex data sets for knowledge discovery. As a response to the resulting waste of valuable data in the old publishing paradigm, many funders now request data management and dissemination plans. However, no professional, global and user-friendly environment exists to enable effective data publishing, rediscovery, sharing and re-use. The conference brought together a unique group of experts who have been addressing this topic for many years and who may now collectively have the network and the solutions to consolidate their efforts.

The conference proved to be very interactive and collaborative, and throughout the four days the topic of the Data Fairport was discussed from different angles. In this process progressive insight was gained and broad consensus was established. This report focuses on these aspects of the conference.

It was decided to narrow down the reported outcome to the crucial insights that evolved in the various sessions. As a result of this approach, the report is not strictly chronological. All conference material is available at the following link: https://www.dropbox.com/sh/6uil51i7kedex99/nifGwKM_FY

List of unconference Participants

 

Participant | Affiliation/role
Dr. IJsbrand Jan Aalbersberg | Elsevier Senior Vice President
Gaby Appleton | Elsevier managing director strategy
Dr. Myles Axton | Nature Genetics
Dr. Niklas Blomberg | Director of ELIXIR Hub (UK)
Dr. Maurice Bouwhuis | External Relations Officer, SURFsara (representing Dr. Ir. Anwar Osseyran)
Prof. Anthony Brookes | Uni Leicester, GEN2PHEN-Alliance & BioShaRE (GWAS-Central, Cafe Variome, OmicsConnect)
Prof. Tim Clark | Harvard Medical School, Mass. General Hospital and Force11
Olivier Dumon | Elsevier (managing director), Netherlands
Dr. Michel Dumontier | Stanford, NCBO and Bio2RDF (representing Prof. Mark Musen)
Prof. Carole Goble | University of Manchester, Deputy head of UK ELIXIR node
Dr. Paul Groth | VU University of Amsterdam, W3C and Open PHACTS
Prof. Frank van Harmelen | VU University Amsterdam and Open PHACTS
Prof. Jaap Heringa | VU University of Amsterdam, Deputy Head of ELIXIR Node NL
Prof. Joost Kok | Prof. of Computational Science, Leiden University, LIACS
Dr. Ruben Kok | Director of Dutch Techcentre for Life Sciences (DTL)
Prof. Johan van der Lei | Erasmus Medical Center, coordinator of EMIF
Prof. Barend Mons | Convener, Leiden University Medical Center, Head of ELIXIR Node NL
Abel L. Packer | SciELO, Brazil, Latin America
Dr. Bengt Persson | Head of ELIXIR node Sweden / BILS
Dr. Thierry Sengstag | Swiss Institute of Bioinformatics (representing Prof. Ron Appel)
Dr. Ted Slater | YarcDATA and semantic web specialist USA
Dr. George Strawn | Director of NCO/NITRD (USA)
Drs. Jan Velterop | Acknowledge and OA specialist, UK
Dr. Mark Wilkinson | Semantic tool specialist, Spain and SADI

Moderators:
Drs. Mr. Arie Baak | Netherlands eScience Center and Phortos Consultants
Dr. Jan Willem Boiten | Dutch Techcentre for Life Sciences (DTL) and CTMM
Dr. Rob Hooft | Netherlands eScience Center and DTL
Dr. Scott Lusher | Netherlands eScience Center
Drs. Albert Mons | Phortos Consultants and Euretos
Dr. Erik van Mulligen | Erasmus Medical Centre and S&T (general rapporteur)
Dr. Rene van Schaik | CEO ad interim, Netherlands eScience Center
Dr. Morris Swertz | Dutch Techcentre for Life Sciences (DTL) and BBMRI-NL

1 Vision, Mission & Scope

Key insights from the introductory session to define the scope and mission of the FAIRPORT included the critical notion that the FAIRPORT needs to build on a (formal) relationship with existing initiatives; this is one of the most important aspects of the DATA FAIRPORT. Rather than creating another forum to augment all the initiatives that already exist, we need to find a way to make all ‘Enablers’ part of this initiative. The ‘extra’ dedicated role of the FAIRPORT initiative would be to create the social as well as the technical/protocol ‘backbone’ where users and enablers can effectively build and operate together.

Enablers were defined later on as stakeholders with a solution to one of the sub-challenges in the FAIRPORT. These could be semantic solutions like ontologies (e.g. NCBO, ORCID, VIVO), standards (e.g. W3C, ELIXIR, NCBI), technological solutions, as well as more ‘policy making’ stakeholders such as the funders and standard-setting bodies. For the very reason that enablers range from purely technical implementers to standards bodies to research policy makers and funders, the ‘backbone’ of the FAIRPORT should have both a technical and a social character.

An example of this mix of social and technical challenges is credit: the lack of credit for data sharing has been identified as a major stumbling block. Enabling citation of individual data sets, or even of the associative elements within them, is first of all a technical issue (interoperability, citability and meaningful metrics). Yet the technical ability to track exactly where data came from, where they are now and who has successfully reused them for further discoveries is not enough. The social aspects are equally important, such as convincing ‘the deans’ to take the altmetrics for data seriously in career development decisions. For this very reason we also cannot separate data stewardship from software stewardship. Algorithms to analyse data, and the infrastructure, standards and software to deal with authentication, authorisation, security and privacy, cannot be seen and developed in isolation from ‘what we want with the data’.

It was therefore decided early on to focus on the optimal ‘backbone’ at the narrow waist of the hourglass and to design everything based on well-defined use cases, with benefits for all conceived user categories. It was also decided to refrain from overly detailed plans at this stage, and rather to define what will be worked out in detail by dedicated and specialized interactive work packages after the meeting.

Figure 1 The hourglass used for discussions at the meeting

DatafairportFianlReportFigure1.png

A final remark on Vision: The FAIRPORT principle is generic for all eScience approaches, also outside of the Life Sciences. However, we will focus entirely on Life Sciences first. That is also the interest area of most of the participants and of the funders that were invited for the wrap-up of the unconference. Also, we all believe that the complexity and the wildly variable character of life sciences data, more so than their sheer volume, are a major challenge, putting Life Sciences in an ‘exemplar’ situation for eScience data needs.

Jumping now to the penultimate session of the conference: an ad hoc session was held on the joint formulation of draft ‘elevator pitches’ for future stakeholder visits. Here we use one example to provide a flavor of what was created in two 10-minute sessions:

  • Despite massive investments in research, most research data and tools become inaccessible with time, and even if available, they often cannot easily be reused for new discoveries.
  • FAIRPORT provides a plug-and-play environment in which researchers can share their research data and tools, using existing community standards.
  • FAIRPORT ensures the long-term preservation of results, promotes scientific integrity and reproducibility, and accelerates the pace of scientific discovery.

We do realise that we may adapt these elevator pitches when visiting different stakeholders, ranging from ‘funders/policy makers’ via ‘Enablers’ (with a vested interest already) to end users such as institutional repository holders and individual experimental scientists or bioinformaticians. The other drafted elevator pitches are attached in Appendix I.

2 Stakeholder Needs Requirements

It was decided that the construction of the DATA FAIRPORT initiative needs to be guided by concrete and compelling use cases, demonstrating early on that this federation of efforts will significantly accelerate the development of a global data interoperability environment. Hence, this section first gives examples of the use cases proposed by the different groups, followed by a selection of the consequential high-level requirements and a broad technical specification of individual FAIRPORT instances and the Network of Interconnected FAIRPORTS. A more detailed overview is provided in the Appendices.

Taking the perspective of various stakeholders the following needs can be articulated

Researchers
  • Easier collaboration with others who also have data
  • Peer review of experimental design (pipeline)
  • Early sign up for research data (as collaborator)
  • Embargo on the data before paper submission
  • Much easier data integration
  • Speed up computer assisted discovery
  • Have data cited for award

Publishers
  • Recruit peer reviewers for data
  • Smooth data submission system
  • Expose content in computer readable format (text + data)
  • Attract citations to articles via data citation
  • Develop direct data citation methods and metrics
  • Enter mainstream of data driven science
  • Develop new business models

Policy Makers
  • Make visible the data sets a funder has sponsored
  • Be citable as funder for data
  • Shorten turn round time; reuse data before it is published
  • Increase success of existing projects
  • Generate societal benefits (and not only for the individual researcher)
  • Provide professional tools for compliance with data deposition/stewardship plans
  • Enable a seamless fit and/or transition from existing practices

The basic use case/needs that the DATA FAIRPORT should address can be summarised as follows:

1. Data Stewardship: A user wants to upload data, provide standard-compliant metadata descriptions, and submit to archive

2. Data Exploration: A user wants to explore the relationships among submitted data

a. standards used for metadata annotation
b. distribution of data submitters by affiliation, location

3. Knowledge Discovery: A biomedical researcher wants to identify heart genes involved in the response to low-oxygen conditions

4. Efficient data submission system: requesting data deposition and sharing in a data management or stewardship plan requires easy and user-friendly submission of datasets in the required formats for publishing, sharing and re-use

These basic use cases are a highly condensed view of the long list of potential use cases that could be met by the DATA FAIRPORT; the full list is available in Appendix II.

Against the backdrop of these use cases, user requirements were determined. Below is a high-level overview of the key requirements that have been articulated. The detailed set of requirements is available in Appendix III. DATA FAIRPORTS:

1. must have meta data, can contain data, can contain code.

2. need to be seen as fair: analogy to github, can add and fork without barrier, pull/merge, versioning, sharing/private.

3. need a minimal meta data model to define the data description templates (the hard bit).

4. need minimal tools to validate that (meta)data conform to the templates.

5. need minimal annotation tools so review, qc, certification, other contributions can be traced.

6. need to have communication interfaces (API).

7. require fair governance (e.g. via ELIXIR, other ESFRI).

8. need SLAs and contracts with code and software providers, and with software carpentry institutions such as SSI, specifically for software for which FAIRPORT data are the direct substrate.

9. should accommodate 3rd party add-ons that provide many types of specific services making use of the DATA FAIRPORT infrastructure.
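
Requirements 3 and 4 above (a minimal metadata model and minimal validation tools) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the `Template` class, the `validate` function, the field names and the licence vocabulary are all hypothetical and were not defined at the meeting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Template:
    """Hypothetical minimal metadata template: required fields plus controlled vocabularies."""
    required: Set[str]
    vocabularies: Dict[str, Set[str]] = field(default_factory=dict)


def validate(metadata: dict, template: Template) -> List[str]:
    """Return a list of problems; an empty list means the record conforms to the template."""
    problems = []
    # Every required field must be present and non-empty.
    for name in sorted(template.required):
        if not metadata.get(name):
            problems.append(f"missing required field: {name}")
    # Fields bound to a controlled vocabulary must use one of its terms.
    for name, allowed in template.vocabularies.items():
        value = metadata.get(name)
        if value is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not in the controlled vocabulary")
    return problems


# Illustrative template and records (placeholder identifiers, not real data sets).
TEMPLATE = Template(
    required={"identifier", "title", "creator", "license"},
    vocabularies={"license": {"CC0-1.0", "CC-BY-4.0"}},
)
good = {"identifier": "example:dataset-1", "title": "Hypoxia response in heart tissue",
        "creator": "A. Researcher", "license": "CC-BY-4.0"}
bad = {"identifier": "example:dataset-2", "title": "", "license": "proprietary"}

print(validate(good, TEMPLATE))
for problem in validate(bad, TEMPLATE):
    print(problem)
```

Returning a list of problems rather than a single yes/no answer is a design choice of this sketch; it would let annotation or review tools (requirement 5) record exactly why a submission was rejected.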

A very important consideration in creating the DATA FAIRPORT is the principle that it leverages standards and capabilities that are already in place and promotes their wider adoption.

The following list of potential services that could be supported in the DATA FAIRPORT was suggested by a spontaneously formed ‘SKUNK’ tech team. The services in bold are deemed most important:

  • user and data identity services
  • authentication services
  • terminology services
  • metadata annotation
  • metadata aggregator
  • data archiving services
  • data redundancy (torrents)
  • data conformance services
  • data import/export services
  • content aggregation services
  • access “altmetrics”
  • search services
  • browse services
  • query services
  • linking services
  • provenance services
  • social media services
  • notification services
  • analysis services

These services would be offered in the context of a collaborative DATA FAIRPORT network where each FAIRPORT will be customizable to serve that organization’s needs. DATA FAIRPORTS may personalize the user experience by implementing their own user interface and by developing or selecting FAIRPORT-compliant services of interest. All FAIRPORTs will have the capability to share their metadata concerning data and services.
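
The idea that all FAIRPORTs share metadata, so that a computer can discover relevant data sets across the network, can be sketched as follows. This is a toy example, not a proposed design: the `FAIRPORTS` dictionary, the record layout and the `discover` function are hypothetical, and a real network would exchange metadata over an API rather than in memory.

```python
from typing import Dict, Iterable, List

# Hypothetical metadata records that two FAIRPORTs might expose (illustrative only).
FAIRPORTS: Dict[str, List[dict]] = {
    "fairport-a": [
        {"id": "a:1", "title": "Cardiac gene expression under hypoxia",
         "keywords": ["heart", "hypoxia"]},
        {"id": "a:2", "title": "Liver proteomics", "keywords": ["liver"]},
    ],
    "fairport-b": [
        {"id": "b:7", "title": "Hypoxia response in zebrafish heart",
         "keywords": ["heart", "hypoxia", "zebrafish"]},
    ],
}


def discover(fairports: Dict[str, List[dict]], wanted: Iterable[str]) -> List[str]:
    """Return the ids of all records whose keywords include every wanted term."""
    terms = {term.lower() for term in wanted}
    hits = []
    for name in sorted(fairports):  # visit FAIRPORTs in a stable order
        for record in fairports[name]:
            if terms <= {keyword.lower() for keyword in record["keywords"]}:
                hits.append(record["id"])
    return hits


print(discover(FAIRPORTS, ["heart", "hypoxia"]))
```

The query mirrors the Knowledge Discovery use case above (heart genes involved in the response to low-oxygen conditions): the search touches only shared metadata, so the data themselves can stay wherever their owners keep them.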

A key question that should be answered as the DATA FAIRPORT develops further is to what extent it would be involved in actual data sharing activities, and what the incentives to do so would be. The central consideration would be what is essential and should be part of the ‘waist’, and what should be done by parties outside and collaborating with the DATA FAIRPORTS. We should be very careful not to create scope creep at the ‘waist’. In essence the DATA FAIRPORT initiative should focus on the ‘minimal Data Fairport scope’ (the green box) in the hourglass picture in Figure 1 and ‘actively engage’ Enablers along the vertical green lines.

3 Solutions & Technology

The FAIRPORT meeting obviously did not happen in splendid isolation. It was partly conceived to inform the ELIXIR work stream developing a strategy for data interoperability, and will also respond to the developing inter-ESFRI policy documents on data management and cross-ESFRI sharing, under preparation in the context of BioMedBridges, as well as to comparable initiatives in the United States and other continents.

The participants of the workshop came to the conclusion early on that, for the high-level architecture and services described in the previous section to materialise, we have to follow the established minimal ‘hourglass’ approach from computer science driven design. A FAIRPORT will not dictate a single platform or a tightly integrated data infrastructure. Rather it will focus on minimal but strict conventions that enable a wide variety of services and applications needed to realize computer-friendly as well as human-friendly data interoperability, stewardship and compliance against data and metadata standards, policies and practices.

The hourglass focuses on the specification of lightweight interfaces, standard protocols and standard formats. ELIXIR’s workstreams on Tools and Data Interoperability are also clearly moving in this direction. The conventions for data and model metadata descriptions should be founded on community standards for identifiers, data formats, checklists and controlled vocabularies. Conventions for other aspects are still to be defined, and the more distant they are from the narrow waist of the hourglass, the more diverse they may become. However, by complying with the minimal protocols and standards, all these services and applications become interoperable in principle. In the context of the Internet, IP, TCP and UDP are the typical ‘narrow waist’ elements on which protocols such as HTTP rely, while HTML and XML allow all of this to be presented in a browser environment. We will further develop a first vision of ‘the minimal Data Fairport scope’ analogous to this model (see footnote 1). Seen from a different perspective, the final aim of the FAIRPORT is a fully interoperable ‘Internet of Data’.

1 This will be discussed and decided upon in a separate ‘scope consensus phase’
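
As an illustration only, the ‘narrow waist’ idea can be sketched in a few lines of Python. Everything below is hypothetical: the `DataSource` interface, its `describe()` method, the two example back ends and the example.org endpoint are assumptions made for this sketch and are not part of any agreed ‘minimal Data Fairport scope’.

```python
from typing import List, Protocol


class DataSource(Protocol):
    """The hypothetical narrow waist: the only thing a service may rely on."""

    def describe(self) -> dict:
        ...


class CsvArchive:
    """One possible back end below the waist: a flat file."""

    def __init__(self, path: str, rows: int) -> None:
        self.path = path
        self.rows = rows

    def describe(self) -> dict:
        return {"identifier": f"file:{self.path}", "format": "text/csv", "records": self.rows}


class TripleStore:
    """A very different back end that exposes the same minimal description."""

    def __init__(self, endpoint: str, triples: int) -> None:
        self.endpoint = endpoint
        self.triples = triples

    def describe(self) -> dict:
        return {"identifier": self.endpoint, "format": "application/n-triples",
                "records": self.triples}


# Services above the waist use describe() only, so they work with any back end.
def catalogue(sources: List[DataSource]) -> List[str]:
    return sorted(source.describe()["identifier"] for source in sources)


def total_records(sources: List[DataSource]) -> int:
    return sum(source.describe()["records"] for source in sources)


sources = [CsvArchive("cohort.csv", 120), TripleStore("https://example.org/sparql", 5000)]
print(catalogue(sources))
print(total_records(sources))
```

Neither service knows how the data are stored, and neither back end knows which services exist; only the small shared interface has to stay stable, which is the property the hourglass approach aims for.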

Whether data producers or owners format data according to the ‘minimal Data Fairport scope’ behind or outside their firewall is entirely their own decision, but as soon as the data comply with the ‘minimal Data Fairport scope’, all services can be run on them, either in combination with public data or not, and either inside or outside firewalled analytics environments. The latter decision is not the realm of the international FAIRPORT network, and firewalled services can be offered by anyone, public or private. For a first impression of a skunk implementation of a FAIRPORT service, please see the HTML mock-up in the Dropbox. Several service providers that comply with the ‘minimal Data Fairport scope’ already exist. These will be approached to work with the Technical Task Force of the FAIRPORT initiative to further streamline their services so that they become the first ‘minimal Data Fairport scope’-compliant services, which may be demonstrated in September 2014 along with dedicated exemplars.

4 Access & Funding Models

The access and funding model of the DATA FAIRPORT can only be fully determined once the scope of the FAIRPORT is defined in detail. The following section lists the elements that can make up such a business model and outlines the key principles that could be applied.

Some of the key considerations of the access & funding models include:

  • There may be ‘common utility’ aspects to the ‘minimal waist services’ of the DATA FAIRPORT that will require an element of public funding to maintain some of the core functions such as a central team or secretariat involved in the conventions and standards side of things.
  • A ‘mixed’ funding model, where public funding is augmented with usage or subscription based funding for specific services is a likely scenario that could develop. In any case the ‘slim waist services’ should serve the needs of many and should be provided as cost effectively as possible without a need for profit.
  • A special consideration will be required for private, commercial and privacy sensitive data. This will require a level of security, associated functionality and cost that should be considered but also implemented separately from the (much lighter) requirements for public data.
  • The development of the ‘slim waist’ should therefore consider, even at the most elementary level, the requirements for private and privacy-sensitive data, analogous to the secure HTTP (HTTPS) protocol.

The following potential elements of the funding model were discussed

Service type: Manage/Drive conventions (under)standards
Potential value:
  • Value at institute and policy maker level; mainly a ‘common utility’ type of value
Key considerations & costs:
  • Requires an initial ‘start-up’ investment and continuing funding
  • Requires a permanent ‘secretariat’ to manage: 10 to 20 fte, at least 700K per year

Service type: Cross ref service for Data (consider cooperation with Cross Ref, get institute backing)
Potential value:
  • Identity for a data set: 1 Euro per ID
  • Storage of data set descriptors: 100 Euros per dataset (including versioning)
Key considerations & costs:
  • Requires a dedicated organisation and infrastructure to manage: 2M per year

Service type: Data Hosting / data stewardship compliancy / data ‘life insurance’
Potential value: ‘Pay as you go’ for:
  • Standard data stewardship compliancy for researchers
  • Registering provenance, statistics, qc, review services
  • Life insurance hosting / data longevity
  • Easy replication of data sets
  • Advertisement of data
Key considerations & costs:
  • Needs endorsement by ELIXIR and NCOs
  • Providers to commit to align with fairport
  • Potential freemium service paid by countries/funders?

Service type: Marketplace ‘on top of’ the DATA FAIRPORT
Potential value: A host of services that use the DATA FAIRPORT as their backbone, where it would receive a ‘revenue share’
Key considerations & costs:
  • Will require the DATA FAIRPORT to provide service levels
  • A host of possible services

5 Governance, Organisation & Operations

The two-pronged scope of the FAIRPORT initiative (the ‘minimal Data Fairport scope’ and associated services, mostly built by expert partners) requires a very special governance and organisation. In the preparatory phase models that work in comparable situations will be studied and proposed for this initiative.

The specific role of ELIXIR, NCO and SciELO in such an environment was defined as a ‘shepherd’ of the ‘minimal Data Fairport scope’, with minimal infrastructure to enable (re)creation, storage, publishing and (re)use of interoperable data. The actual data storage and stewardship organizations, with their own coordinated but distributed hard infrastructure, will take prime responsibility for the actual data archiving and, where needed, controlled access. Thus, the actual storage, exchange and re-use of data will largely take place in the infrastructures and networks of the ELIXIR nodes, and beyond Europe, as eScience is an inherently global playing field. ELIXIR, NCO and SciELO should therefore help ensure that the ‘minimal Data Fairport scope’ will be ‘just in time’ and ‘just enough’ for their constituencies and partners to enable optimal interoperability of data (and obviously of the associated software, ontologies and standards), and should actively encourage the use of these protocols. The funders and (data) publishers involved are very likely to help refine and adopt these standards, as this would make their data resources optimally interoperable in the knowledge discovery arena. The major elements defined at the meeting were that ELIXIR (and its represented sister organizations outside Europe) should (a) gather, nurture and endorse the minimal standards and protocols required for effective data management and long-term data interoperability (the ‘minimal Data Fairport scope’), (b) emphasize use and sustainability of the standards and the protocols, and (c) encourage and organize training of data experts so that implementation of best practices increasingly becomes routine.

6 The next steps: a preparatory phase of 9 months

On the final day of the meeting a general presentation was given to a wider audience. About 45 people attended this session. These included several representatives of policy circles and funders, including the European Commission, IMI, NGI, BBSRC, the RDA, DANS and the Dutch Ministry (Appendix IV).

These participants were asked to give their fresh perspective on the outcome of the conference, and the meeting ended with a round of statements and, in some cases, verbal commitments to the next phase of the FAIRPORT initiative. The outcome of this session was very encouraging and led to the following decisions about follow-up. All participants of the unconference committed in this broader group to spend time and/or funds in a preparatory phase.

Three work packages (WP) were identified. These will be conducted in parallel for an initial period of 8 months, ending at the 4th plenary of the RDA in September 2014, where the more detailed plans for a FAIRPORT and the results of this preparatory phase will be presented.

These work packages are:

1. Writing the more detailed overall implementation plan, including a funding proposition.

2. Active advocacy to all enablers and other stakeholders that were not present at the meeting.

3. An ad hoc Technical Task Force to coordinate and perform the implementation of exemplar services compliant with the ‘minimal Data Fairport scope’, to be demonstrated in September.

These three work packages will act in close collaboration, and a small coordination team will be formed to ensure optimal synergy, planning and performance of the widely distributed groups, based on the proven mixed/distributed management model of NBIC, Open PHACTS and the Netherlands eScience Center. All conference participants have committed to participate, at various levels, in at least one work package. The meeting was followed by a survey to make a more detailed inventory of the various contributions and of the pre-existing social networks, technical assets and funding possibilities of all partners; the results will be used to design the Plan of Approach for the 8 months following the meeting. Obviously, the working groups running the various work streams will be open to important partners that were not present at the meeting, but these additional people will be invited by the Work Package Leaders.

After the first round of reactions to this report and the analysis of the responses to the survey, we will appoint the Work Package Leaders (WPLs) and the overall coordinating team, hopefully by full consensus. DTL in the Netherlands has offered to host the small coordination team and to provide office space and facilities for the first 8-month preparatory phase. We will actively seek similar initial office and coordination facilities in the USA and in Latin America to start with. The funding needs for the initial phase (coordination team, writing TF, Advocacy TF and Technical TF) will be determined in the three work packages.

Appendix I – Elevator Pitches

Elevator Pitch 1

• Science data is increasingly open but standards, formats, and locations are fragmented and difficult to find
• FAIRPORT will simplify long term handling of research data by establishing a community convention
• This lets people write services such that data can be found and reused, trusting that the underlying standards will be used by others and stay around for a long time.
• The cost, effort and time for handling research data will drop dramatically

Elevator Pitch 2

FAIRPORT aims to accelerate the pace of scientific discovery by providing a seamless modern information architecture for efficient data sharing and re-use, so we can move information and data to scientists at the speed of discovery and find cures faster.

Elevator Pitch 3

• Scientific discovery can now be dramatically speeded up in novel, important ways by utilizing much more scientific data. This will result in better medical diagnoses and faster cures
• But today, data are often incompatible
• Thus, to realize the goal of enhanced scientific discovery and better medical outcomes, FAIRPORT will promote the "interoperability of heterogeneous data" to enable computer applications to utilize all potentially relevant data
• This is a technological and social task

Elevator Pitch 4

FAIRPORT aims to be a Google Maps for research data. Users can look at different layers depending on their needs. Others will be able to offer services like search and analysis on top of this system. Quickly navigating existing data will mean efficient use of research funding in data-intensive science, and it will speed new research.

Elevator Pitch 5

• Despite massive investments in research, most research data and tools become inaccessible with time, and even if available, they often cannot easily be reused for new discoveries.
• FAIRPORT provides a plug-and-play environment in which researchers can share their research data and tools, using existing community standards.
• FAIRPORT ensures the long-term preservation of results, promotes scientific integrity and reproducibility, and accelerates the pace of scientific discovery.

Appendix II – Initial list of DATA FAIRPORT use cases

1. Identifiable data sets
2. Citable; linkable; reference to data file/source system
3. ‘Model’ that defines that data set (metadata)
4. Persistent identifier (citeable)
5. Discoverable representation (data provenance)
6. Data items/relations that are computable
7. Data sets that can be validated against ‘model’
8. Value added services on top
9. Sharing of standard ‘models’ (templates)
10. ELIXIR like systems to endorse (my data is ‘proper’)
11. Access to controlled language (with editable functions)
12. Structured representations of research procedures - develop meta data standards
13. Evaluate compliance of meta data submission
14. Mapping of differing accession codes
15. Data transformation services (how to ensure interoperability for meta studies)
16. Define ‘models’
17. Promote the convention
18. A possible minimal FAIRPORT Convention interface
19. Git for research data & tools (Note: not GitHub). Minimal elements:

  • packaged research objects (identification)
  • Services for annotating them
  • Services for recording provenance
  • Services for reading & writing all of these
  • With write-limitations for ownership, integrity
  • With read-limitations for privacy
  • Services for logging & tracing facilities

20. Existing solutions and examples: DOI, W3C OpenAnnotation Model, existing vocabularies & vocabulary languages,
Researchobject.org. Partial examples: GitHub, FigShare, DataCite, Dryad, OliveArchive
21. Versioning, snapshotting, dependency-tracking of data sets (GitHub-style)
22. How to move computation to the data (embassy clouds, sandboxes)
23. Discussion: guaranteeing security & quality: should this be part of the Convention, or left to different implementations?
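
The minimal elements listed under use case 19 (packaged research objects, annotation, provenance, read and write limitations, logging) can be pictured with a small sketch. Everything below is an illustrative assumption rather than anything agreed at the meeting: the class name `ResearchObject`, its fields, the `urn:uuid:` identifier scheme and the single-owner permission rule are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass
class ResearchObject:
    """Hypothetical packaged research object; not a FAIRPORT specification."""
    owner: str
    payload: bytes
    public: bool = False
    # Assumed identifier scheme; a real convention might use DOIs or Handles.
    identifier: str = field(default_factory=lambda: f"urn:uuid:{uuid.uuid4()}")
    annotations: list = field(default_factory=list)
    provenance: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def _trace(self, user, action):
        # Logging & tracing: record who did what, and when (UTC).
        self.log.append((datetime.now(timezone.utc).isoformat(), user, action))

    def read(self, user):
        # Read-limitation for privacy: non-public objects are owner-only.
        if not self.public and user != self.owner:
            self._trace(user, "read denied")
            raise PermissionError("read restricted for privacy")
        self._trace(user, "read")
        return self.payload

    def write(self, user, payload, source):
        # Write-limitation for ownership and integrity: only the owner may write.
        if user != self.owner:
            self._trace(user, "write denied")
            raise PermissionError("write restricted to the owner")
        self.payload = payload
        self.provenance.append({"by": user, "derived_from": source})
        self._trace(user, "write")

    def annotate(self, user, note):
        # Annotation service: anyone may attach a note in this sketch.
        self.annotations.append({"by": user, "note": note})
        self._trace(user, "annotate")
```

In this sketch anyone may annotate, while reading and writing are checked against the owner; whether such rules belong in the Convention or in individual implementations is exactly the open question raised in item 23.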

Appendix III – Initial list of DATA FAIRPORT detailed requirements

1. ensure meta data is captured to standards
2. have citable publication of the data
3. enable to show that data is used
4. Allow for bootstrapping start functionality:
5. My data is in container X including a bar code on content
6. It can be found and recognised and, if needed, transported
7. Any functionality from user demand can be stacked and provided by a specialist providing systematic data handling compliance
8. Awareness of what is available datawise
9. Identifying relevant data (sets) and Providing information on data (sets); catalog of datasets
10. Providing access to datasets with security and levels of access controls
11. Conversion of data (sets) into interoperable format
12. Speeding up data analyses processes
13. Define value returned to the submitter as well as to the community
14. Could take form of certain kinds of similarity search (“you may be interested in these other similar datasets”), statistics, &
other computations.
15. Also: citability, funding body impact scores, publishability.
16. Reciprocity of use…
17. Core “bibliographic type” metadata: think of “PubMed for data”
18. Cross reference to WHERE the data resides
19. Default data storage where there is no specialist site to store
20. Citable persistent HDL/DOI…
21. Domain-specific metadata plugins e.g. MIAME for arrays, etc
22. Availability metadata – e.g. public/restricted?
23. Optional (but important for our use case) Dataset-specific metadata
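
Requirements 17 to 23 describe a layered metadata record: a bibliographic core, a pointer to where the data resides, a persistent identifier, availability, and optional domain-specific plugins. A minimal sketch of such a record and a compliance check follows. The field names, the `doi:`/`hdl:` prefix rule and the function `check_record` are assumptions made for illustration only, and the identifier in the usage note is made up.

```python
from dataclasses import dataclass, field

# Assumed vocabulary for requirement 22 (availability metadata).
ALLOWED_AVAILABILITY = {"public", "restricted"}


@dataclass
class DatasetRecord:
    """Hypothetical 'PubMed for data' style record."""
    title: str
    creators: list
    persistent_id: str   # requirement 20: citable persistent HDL/DOI
    location: str        # requirement 18: where the data resides
    availability: str = "public"
    # Requirement 21: domain-specific plugins, keyed by standard name (e.g. "MIAME").
    domain_metadata: dict = field(default_factory=dict)


def check_record(record):
    """Return a list of problems; an empty list means this toy check passed."""
    problems = []
    if not record.title.strip():
        problems.append("missing title")
    if not record.creators:
        problems.append("missing creators")
    if not record.persistent_id.startswith(("doi:", "hdl:")):
        problems.append("persistent_id should be a DOI or Handle")
    if not record.location:
        problems.append("missing location")
    if record.availability not in ALLOWED_AVAILABILITY:
        problems.append("availability must be public or restricted")
    return problems
```

For example, `check_record(DatasetRecord("Array study", ["A. Researcher"], "doi:10.1234/example", "https://repository.example/ds1"))` returns an empty list, while a record with an empty title and an unknown availability value returns two problems.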

Slides

Source: http://www.slideshare.net/SN-CM/big-data-14647516# (PPT) My Note: Includes Transcript

Slide 1. A curse of interdisciplinarity

“A challenge in the other discipline always seems ‘easy’ because we are not hindered by knowledge.”
Barend Mons (DTL-DISC/ELIXIR), NBIC, LUMC

BarendMons10032012Slide1.PNG

Slide 2. Dutch Techcentre For Life Science

PPP10/09/12 2

http://www.dtls.nl/dtl/

BarendMons10032012Slide2.PNG

Slide 3. ELIXIR

Safeguarding the results of life science research in Europe European Life Sciences Infrastructure for Biological Information http://www.elixir-europe.org

BarendMons10032012Slide3.PNG

Slide 4. DISC: the connected data departments of DTL research Hotels

Diagram labels: DISC*, technology facilities, technology research, education & training, DTL
*) DISC = DTL Data Integration & Stewardship Centre

BarendMons10032012Slide4.PNG

Slide 5. What is bioinformatics?

• The science of storing, retrieving and analysing large amounts of biological information
• An interdisciplinary science involving biologists, biochemists, computer scientists and mathematicians
• At the heart of modern biology

BarendMons10032012Slide5.PNG

Slide 6. Bioinformatics underpins life-science research

1. Genomes contain genes
2. Genes are transcribed
3. Transcripts translate to protein sequences
4. Proteins form three-dimensional structures
5. Proteins interact with each other and with small molecules to form pathways
6. Pathways combine to build systems

BarendMons10032012Slide6.PNG

Slide 7. Life Science data: Multi-omics, multi-technology, multi organism, multi dimensional

BarendMons10032012Slide7.PNG

Slide 8. From molecules to medicine

Diagram headings: Molecular components, Integration, Translation
Diagram labels: Genomes, Nucleotides, Transcripts, Proteins, Domains, Structures, Small molecules, Complexes, Pathways, Cells, Tissues and organs, Human individuals, Human populations, Biobanks, Therapies, Disease prevention, Early diagnosis

BarendMons10032012Slide8.PNG

Slide 9. What is ELIXIR?

• An ESFRI research infrastructure of global significance
• Unites Europe’s leading life science organisations in managing and safeguarding the vast amounts of data being generated every day by publicly funded research.
• A large-scale initiative that will provide the facilities necessary for Europe’s life-science researchers to make the most of our rapidly growing store of information about living systems, which is the foundation on which our understanding of life is built.

BarendMons10032012Slide9.PNG

Slide 10. Why ELIXIR?

• Creating a robust infrastructure for biological information is a bigger task than EMBL-EBI – or any individual organisation or nation – can take on alone.
• Biology has by far the largest research community:
  • ~3 million life science researchers in Europe
  • >6 million web hits a day at EMBL-EBI alone
• We need to involve other European partners

BarendMons10032012Slide10.PNG

Slide 11. The challenge

• Computer speed and storage capacity is doubling every 18 months and this rate is steady
• DNA sequence data is doubling every 6-8 months over the last 3 years and looks to continue for this decade
Guy Cochrane, ENA, EMBL-EBI

BarendMons10032012Slide11.PNG
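
The gap between the two doubling rates on Slide 11 is easier to appreciate with a worked number. The sketch below is only an illustration: the 36-month horizon is an assumption chosen to match the "last 3 years" mentioned on the slide, and `growth_factor` is a hypothetical helper.

```python
def growth_factor(months, doubling_time_months):
    """Multiplicative growth after `months`, given a fixed doubling time."""
    return 2 ** (months / doubling_time_months)


HORIZON_MONTHS = 36  # assumed horizon: the "last 3 years" from the slide

compute_growth = growth_factor(HORIZON_MONTHS, 18)   # computing and storage
sequence_growth_fast = growth_factor(HORIZON_MONTHS, 6)   # data doubling every 6 months
sequence_growth_slow = growth_factor(HORIZON_MONTHS, 8)   # data doubling every 8 months

print(f"compute x{compute_growth:.0f}, "
      f"sequence data x{sequence_growth_slow:.1f} to x{sequence_growth_fast:.0f}")
```

Under these assumptions computing capacity grows about 4-fold in three years while sequence data grows roughly 23- to 64-fold, which is the mismatch the slide points at.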

Slide 12. Europe has already paid for the science

Chart labels: annual cost of generating new protein structure data in labs around the world; annual cost of maintaining the data in a central database

BarendMons10032012Slide12.PNG

Slide 13. ELIXIR’s mission

To build a sustainable European infrastructure for biological information, supporting life science research and its translation to: medicine, environment, bioindustries, society

BarendMons10032012Slide13.PNG

Slide 14. A distributed pan-European infrastructure

BarendMons10032012Slide14.PNG
 

Slide 15. Benefits

ELIXIR will contribute to European innovation by:
• Optimising access and exploitation of life-science data
• Ensuring longevity of the data, thereby protecting investments already made in research
• Enhancing the quality of European research by supporting national efforts to increase the competence and number of bioinformatics users through training
• Strengthening the global position and influence of Europe in life-science research in both academia and industry

BarendMons10032012Slide15.PNG

Slide 16. The scientific reason for ELIXIR

• Data is an essential commodity for life-science research.
• Ten years ago, finding the connection between a gene and a characteristic (e.g. drought tolerance, risk of heart disease) could take years; now it takes minutes.
• Data analysis is now the bottleneck in life-science research
• ELIXIR is our only realistic hope of easing that bottleneck
Image courtesy of Genome Research Ltd.

BarendMons10032012Slide16.PNG

Slide 17. Commentary

BarendMons10032012Slide17.PNG

Slide 18. One societal reason for ELIXIR

• The era of personal genome sequencing is upon us.
• Sequence data will not cross national boundaries.
• Every national health system will need expertise to interpret it and treat patients accordingly.
• Individuals need to be sure that their personal biological data are in safe hands.

BarendMons10032012Slide18.PNG

Slide 19. The financial reason for ELIXIR

• Europe has already spent the money to generate the data.
• It will waste all this investment in research if the future of the data is not secured.
• Industry, from SMEs to big multinationals, needs access to public data to analyse its proprietary data.

BarendMons10032012Slide19.PNG

Slide 20. Maintaining open access

• Open access to life science is essential for advances in many areas of research
• Open access to bioinformatics resources provides a valuable path to discovery, one that in many other areas of research is limited by commercial confidentiality
• Charging for that data, or seeking to restrict access through exercising Intellectual Property (IP) rights, would impede progress
• ELIXIR will guarantee that open access to biological data is maintained. Speaking with a single voice will strengthen Europe’s influence in such global discussions.
Mark Forster, Syngenta, member of the EMBL-EBI Industry Programme

BarendMons10032012Slide20.PNG

Slide 21. 13 ELIXIR Countries

BarendMons10032012Slide21.PNG

Slide 22. Part two >>>> eScience in LS

• The way we discover knowledge has changed fundamentally over just a decade.
BIGNORANCE

BarendMons10032012Slide22.PNG

Slide 23. The Data Deluge

The general challenge: Data has far outgrown institutional handling capacity
The issue is everywhere, but Life Sciences is particularly challenged and complex.
More and more we write ‘about datasets’ … that are too large to publish in narrative
The amount of digital data is exploding, with a staggering 1.8 zettabytes in 2011

BarendMons10032012Slide23.PNG

Slide 24. Nanopublications & Cardinal Assertions

Nanopublication
A Nanopublication is the smallest unit of publishable information containing:
1. Assertion: a statement of concepts in terms of one or more ‘subject -> predicate -> object’ (triple) relationships.
2. Provenance:
  a) Attribution – Who made this assertion, when and where?
  b) Supporting information – Any other information which is relevant to the assertion (e.g. this assertion is only valid in humans under 18).

Cardinal Assertion
A Cardinal Assertion aggregates all ‘n’ Nanopublications making the same assertion. It therefore has 1 assertion and ‘n’ provenances, eliminating redundancy.

BarendMons10032012Slide24.PNG
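
The aggregation described on Slide 24 can be sketched in a few lines: nanopublications that state the same triple collapse into one Cardinal Assertion that keeps every provenance. This is a toy model, not the nanopublication schema itself; the class `Nanopublication`, its field names and the function `cardinal_assertions` are assumptions, and the attributions in the usage note are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Nanopublication:
    """Toy nanopublication: one triple plus its provenance."""
    subject: str
    predicate: str
    obj: str
    attribution: str      # who made the assertion, when and where
    supporting: str = ""  # any other information relevant to the assertion


def cardinal_assertions(nanopubs):
    """Group nanopublications by triple: 1 assertion -> 'n' provenances."""
    grouped = defaultdict(list)
    for nanopub in nanopubs:
        triple = (nanopub.subject, nanopub.predicate, nanopub.obj)
        grouped[triple].append((nanopub.attribution, nanopub.supporting))
    return dict(grouped)
```

Feeding in three nanopublications, two of which assert `("CFTR", "is associated with", "cystic fibrosis")` with different made-up attributions, yields two Cardinal Assertions, the first carrying two provenances.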

Slide 25. Under the hood……

BarendMons10032012Slide25.PNG

Slide 26. Managing volume & complexity

Combining Cardinal Assertions with Concept profiles reduces the amount of data with ≈99.999996%
• Individual Nanopublications: > 10^14
• Individual Cardinal Assertions: > 10^11
• Individual Concept Profiles: ≈ 4x10^6

BarendMons10032012Slide26.PNG
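
The reduction figure on Slide 26 can be checked with simple arithmetic, reading the slide's numbers as more than 10^14 individual nanopublications, more than 10^11 cardinal assertions and about 4x10^6 concept profiles. Treating those lower bounds as exact counts is an assumption made only for this check.

```python
# Counts read off the slide; lower bounds are treated as exact for this check.
nanopublications = 1e14
cardinal_assertions_count = 1e11
concept_profiles = 4e6

# Share of items removed at each stage, as a percentage of all nanopublications.
after_cardinal = (1 - cardinal_assertions_count / nanopublications) * 100
after_profiles = (1 - concept_profiles / nanopublications) * 100

print(f"cardinal assertions: {after_cardinal:.1f}% fewer items")
print(f"concept profiles:    {after_profiles:.6f}% fewer items")
```

Going from 10^14 items to 4x10^6 leaves a fraction of 4x10^-8, which is a reduction of 99.999996%, in line with the figure quoted on the slide.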

Slide 27. The LS concept web: 2x2x10^6 concepts (profiles)

Slide 27. A dynamic Concept Web versus a static Ontology 28

BarendMons10032012Slide27.PNG

Slide 28. Genes related to Cystic fibrosis

BarendMons10032012Slide28.PNG

Slide 29. Scatter Plot

Plot labels: known reference pairs; non-co-occurrence pairs; more mutual information; no increase in concept overlap; including manual curation; more concepts in common; removal of low info paths

BarendMons10032012Slide29.PNG

Slide 30. Network Graph

BarendMons10032012Slide30.PNG

Slide 31. eScience…. in silico reasoning and in cerebro validation

Expert Skype calls Reading up

BarendMons10032012Slide31.PNG

Slide 32. Organisation of the ecosystem

Global Authority Nanopublishers App & Service Users Providers Endorse CA Space Application Knowledge (OCS & ICS) development Management Providers Reasoning services Practices Academic & Best ONS/INSs technical and Commercial process Users consultancy project Knowledge Original delivery Discovery Assist & Data Owners capacity Certify

BarendMons10032012Slide32.PNG

Slide 33. ORCID VIVO

BarendMons10032012Slide33.PNG

Slide 34. IN ANY CASE

Regardless of how ‘sensitive’ your data is, it is malpractice to:
- Generate data without a solid stewardship plan
- Build impenetrable SILOS
- Fail to record provenance
- Store them in non interoperable format
- Think that data=information
EVEN if your only goal is the Nobel Prize (or for Dutch: a Spinoza Prize)

BarendMons10032012Slide34.PNG

Slide 35. Acceptance of Semantic Web Approach

Over the last decade, academic research organisations developed new methodologies and tools to address the Big Data problem.
Global agreement by leading scientists on unique Nanopublication solution.
100’s of millions already invested in the basis technology.
Applicable as a technology across (STM) domains and industries.
Pharmaceutical companies are early adopters (Innovative Medicine Initiative).

BarendMons10032012Slide35.PNG

Slide 36. Acknowledging

The ‘Dutch Team’…
• Herman van Haagen, MSc. (LUMC)
• Dr. Peter Bram ‘t Hoen (LUMC)
• Dr. Marco Roos (LUMC)
• Dr. Erik Schultes (LUMC)
• Prof. Johan den Dunnen (LUMC)
• Prof. Gertjan van Ommen (LUMC)
• Dr. Erik van Mulligen (EMC)
• Dr. Jan Kors (EMC)
• Dr. Martijn Schuemie (EMC)
• Prof. Johan van der Lei (EMC)
• Dr. Rob Hooft (NBIC)
• Dr. Christine Chichester (NBIC)
• Dr. Leon Mei (NBIC)
• Kees Burger (NBIC)
• Bharat Singh (NBIC/EMC)
• Dr. Marc van Driel (NBIC)
• Dr. Ruben Kok (NBIC)
• Prof. Marcel Reinders (NBIC)
• Prof. Jaap Heringa (NBIC)
• Prof. Gert Vriend (NBIC)
• Dr. Morris Swertz (BBMRI, CWA)
• Dr. Andra Waagmeester (NBIC)
• Dr. Kristina Hettne (LUMC)
• Dr. Rene van Schaik (eScience Centre)
• Drs. Albert Mons (PHORTOS consultants)
• Mr. Drs. Arie Baak (PHORTOS consultants)

CWA - Open PHACTS
• Prof. Amos Bairoch (SIB, Switzerland, CWA)
• Prof. Carole Goble (Manchester, CWA, OPS)
• Prof. Katy Borner (Indiana University, CWA)
• Prof. Mark Musen (NCBO, Stanford, CWA, OPS)
• Dr. Pascale Gaudet (UniProt, ISB, CWA)
• Dr. Mike Conlon (VIVO, UF, CWA)
• Prof. Maryann Martone (Force 11, USC, CWA)
• Dr. Nigam Shah (NCBO, Stanford, CWA, OPS)
• Dr. Mark Wilkinson (Canada, CWA)
• Abel Packer (Brazil, Scielo, CWA, OPS)
• Jan Velterop (ACKnowledge, CWA, OPS)
• Albert Mons (CWA, NBIC)
• Prof. Frank van Harmelen (FUA/LARKC, CWA, OPS)
• Dr. Chris Evelo (Maastricht, CWA, OPS)
• Dr. Antony Williams (RSC/ChemSpider, CWA, OPS)
• Dr. Richard Kidd (RSC, OPS)
• Dr. Paul Groth (FUA, CWA, OPS)
• Dr. Michel Dumontier (Canada, CWA, OPS)
• Dr. Andrew Gibson (UA, CWA, OPS)
• Dr. Bryn Williams-Jones (Pfizer, OPS)
• Dr. Ian Dix (Astra Zeneca, OPS)
• Dr. Niklas Blomberg (Astra Zeneca, OPS)
• Dr. Mike Barnes (GSK, OPS)
• Prof. Jan-erik Litton (CWA, BBMRI)

BarendMons10032012Slide36.PNG

Spotfire Dashboard


Research Notes

Data FairPort Files Excerpts

2010 VIVO National Conference
Scientific Sessions
Friday, August 13, 2010
Presentation: "The Next Step in Knowledge Evolution; Colonization of Brains..." (Video)
Speaker: Barend Mons, Netherlands Bioinformatics Centre (NBIC)

Skunkworks are high-priority R&D projects in which a small team is taken out of their normal working environment and given exceptional freedom from their organisation's standard management constraints.

SKUNKPORT is a rapid prototype for a revolutionary infrastructure to foster a competitive infrastructure for preserving, discovering and reusing research data and tools.  

Each FAIRPORT will be customizable to serve that organization’s needs (think Drupal). FAIRPORTS may personalize the user experience by implementing their own user interface and by developing or selecting fairport-compliant services of interest. All FAIRPORTs will have the capability to share their metadata concerning data and services.

http://www.cnri.reston.va.us/doa.html

What does one individual FAIRPORT do minimally?

SkunkPort
use a small pot of money to create an initial, small scouting team;  
skunkport: identify a use case, develop an early functional spec, develop a prototype that wires existing resources together
answer questions from the community; pathfinders group

What are you doing?
...
...
...
Why are you doing it?
...
...
...
What will happen? What will people experience?
...
...
...
How will it help? What will be the benefits?

FAIRPORT establishes a convention for community research data. This lets people write services such that data can be found and reused trusting that the underlying standards will be used by others and stay around for a long time…

Data retrieval / storage (making it discoverable for individual scientists)

George Strawn Elevator Pitch

Scientific discovery can now be dramatically speeded up in novel, important ways by utilizing much more scientific data. This will result in better medical diagnoses and faster cures
But today, data are often incompatible
Thus, to realize the goal of enhanced scientific discovery and better medical outcomes, FAIRPORT will promote the "interoperability of heterogeneous data" to enable computer applications to utilize all potentially relevant data

Ideal Outcome – What is the outcome that should be achieved? 
Already in place – Outline what can (and should) be (re)used.
Key Gaps – What are the key gaps between the ideal outcome and the present situation
Barriers & Risks – What key barriers & risks exist for closing the gaps and achieving the ideal outcome and how should they be mitigated (i.e. turned into enablers)
Key Actions – What are the key actions that should be undertaken taking into account the above

1. It may be useful to look at the BSI Kitemark to draw inspiration for Fairport services:

http://www.bsigroup.com/en-GB/our-se...oose-kitemark/

2. http://mozillascience.org/code-as-a-...a-new-project/

Scientific reproducibility and big data

Meeting of the President’s Council of Advisors on Science and Technology (PCAST)

DATE:  Friday, January 31, 2014

TIME:  9:00 a.m. until approximately 12:00 p.m.

LOCATION:  National Academy of Sciences, Lecture Room

                        2101 Constitution Avenue, NW, Washington, DC (Nearest Metro station: Foggy Bottom)

PRIMARY TOPICS:

  • Scientific reproducibility and big data
  • Challenges and opportunities in science and technology at the Department of Commerce

KEY SPEAKERS:

  • Glenn Begley, Chief Scientific Officer and Senior Vice-President R&D, TetraLogic Pharmaceuticals
  • Donald Berry, Professor, Department of Biostatistics, University of Texas MD Anderson Cancer Center
  • Dan MacArthur, Group Leader, Analytic and Translational Genetics Unit, Massachusetts General Hospital, and Assistant Professor, Harvard Medical School
  • Marcia McNutt, Editor-In-Chief, Science
  • Philip Campbell, Editor-in-Chief, Nature and Nature Publishing Group
  • Veronique Kiermer, Executive Editor and Head of Researchers Services, Nature Publishing Group
  • Patrick Gallagher, Director, National Institute of Standards and Technology and Under Secretary of Commerce for Standards and Technology, US Department of Commerce  
  • Lawrence Strickling, Assistant Secretary for Communications and Information and Administrator, National Telecommunications and Information Administration, US Department of Commerce

REGISTRATION: To attend in person, please register online, here.


WEBCAST: This event will be live-streamed on the web, here. No registration is required to watch the live-stream.

For full agenda (subject to change), visit: http://www.whitehouse.gov/ostp/pcast/meetings.

For more information, visit: http://www.whitehouse.gov/ostp.

President’s Council of Advisors on Science and Technology (PCAST)
Source: http://www.whitehouse.gov/sites/defa...14_updated.pdf
Source: http://www.whitehouse.gov/administra...eetings/future

pcast@ostp.gov

Public Meeting Agenda
January 31, 2014
National Academy of Sciences (NAS)
2101 Constitution Avenue, NW
Washington, DC
Lecture Room

9:00 am Welcome from PCAST Co-Chairs

  • John Holdren, Assistant to the President for Science and Technology; Director, Office of Science and Technology Policy (OSTP); Co-Chair, PCAST
  • Eric Lander, Co-Chair, PCAST

9:05 am Improving Scientific Reproducibility in an Age of International Competition and Big Data I: Researchers

  • Glenn Begley, Chief Scientific Officer and Senior Vice-President R&D, TetraLogic Pharmaceuticals
  • Donald Berry, Professor, Department of Biostatistics, University of Texas MD Anderson Cancer Center
  • Daniel MacArthur, Assistant Professor, Harvard Medical School and Massachusetts General Hospital and Associate Member, Broad Institute of Harvard and MIT

10:00 am Improving Scientific Reproducibility in an Age of International Competition and Big Data II: Editors

  • Marcia McNutt, Editor-In-Chief, Science
  • Philip Campbell, Editor-In-Chief, Nature and Nature Publishing Group
  • Véronique Kiermer, Executive Editor and Head of Researchers Services, Nature Publishing Group

10:45 am Break

11:00 am Challenges for the 21st Century Enterprise: Leveraging S&T Across the Department of Commerce

  • Patrick Gallagher, Director, National Institute of Standards and Technology (NIST) and Under Secretary of Commerce for Standards and Technology
  • Lawrence Strickling, Assistant Secretary for Communications and Information and Administrator, National Telecommunications and Information Administration (NTIA), US Department of Commerce

11:45 am Public Comment

12:00 pm Adjourn

Systematic identification of pharmacogenomics information from clinical trials

J Biomed Inform. 2012 Oct;45(5):870-8. doi: 10.1016/j.jbi.2012.04.005. Epub 2012 Apr 24.

Source: http://www.ncbi.nlm.nih.gov/pubmed/22546622

Author information

National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Abstract

Recent progress in high-throughput genomic technologies has shifted pharmacogenomic research from candidate gene pharmacogenetics to clinical pharmacogenomics (PGx). Many clinical related questions may be asked such as 'what drug should be prescribed for a patient with mutant alleles?' Typically, answers to such questions can be found in publications mentioning the relationships of the gene-drug-disease of interest. In this work, we hypothesize that ClinicalTrials.gov is a comparable source rich in PGx related information. In this regard, we developed a systematic approach to automatically identify PGx relationships between genes, drugs and diseases from trial records in ClinicalTrials.gov. In our evaluation, we found that our extracted relationships overlap significantly with the curated factual knowledge through the literature in a PGx database and that most relationships appear on average 5 years earlier in clinical trials than in their corresponding publications, suggesting that clinical trials may be valuable for both validating known and capturing new PGx related information in a more timely manner. Furthermore, two human reviewers judged a portion of computer-generated relationships and found an overall accuracy of 74% for our text-mining approach. This work has practical implications in enriching our existing knowledge on PGx gene-drug-disease relationships as well as suggesting crosslinks between ClinicalTrials.gov and other PGx knowledge bases.

Published by Elsevier Inc.

PMID: 22546622 [PubMed - indexed for MEDLINE] PMCID: PMC3760158

NIH Public Access

Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3760158/

J Biomed Inform. Author manuscript; available in PMC 2013 October 1.
 
Published in final edited form as:
PMCID: PMC3760158
NIHMSID: NIHMS372549

Systematic identification of pharmacogenomics information from clinical trials

 
Keywords: Text mining, Clinical outcome, Pharmacogenomics, Clinical trial

1. Introduction

Clinical outcomes in response to drugs can be significantly different among individuals, in terms of treatment efficacy and drug toxicity. Although many clinical variables of individuals (e.g., liver function, disease severity, and drug interactions) potentially cause the variability of drug effects, it is now recognized that genetic polymorphisms can have an even greater influence on drug efficacy and safety [1]. Pharmacogenomics (PGx) studies elucidate the inherited nature of variability of drug effects in the context of genomics. Recent progress in high-throughput genomic technologies has significantly enhanced the identification of genetic variations associated with drug absorption, distribution, metabolism, excretion, and target action. The consequent explosion of data has raised challenges in PGx data description, storage, and integration. Meanwhile, pharmacogenomics impacts at many stages along the drug discovery and development pipeline, from target identification in early-stage research to post-marketing surveillance in phase IV clinical trials [2]. The consequent diversity of data types has raised challenges to capture more attributes of genotype and phenotype data as well as more complex relationships between them.

PharmGKB [3], the pharmacogenomics knowledge base, is a widely used resource in PGx studies. It collects PGx related genotype and phenotype data with manually annotated relationships between genes, polymorphisms, drugs, and diseases, as well as provides summarized information on important PGx genes and drug pathways. Over the last decade, PharmGKB has collected and annotated PGx data from a variety of different sources but the scientific literature remains its major source [4]. PharmGKB has developed structures to tag and describe relationships with PGx categories: clinical outcome, pharmacodynamics and drug responses, pharmacokinetics, molecular and cellular functional assays, and genotype [5]. Recent PGx text mining efforts have mainly focused on automatically extracting these relationships from the scientific literature [4,6].

Meanwhile, PGx research has shifted from candidate gene pharmacogenetics to clinical pharmacogenomics [4]. To investigate the clinical applications of PGx studies, clinical trials are designed and conducted. In the context of drug development, clinical trials are necessary steps to determine if a drug is safe and effective (see Figure 1). It usually takes many years for a new drug to pass through Phase I, II, and III before it is approved by the national regulatory authority (e.g., the Food and Drug Administration in the United States). In each phase of clinical trials, studies are separately conducted and approved. In the past, researchers have investigated the persistent gap between the number of trials conducted and the number for which results are published. They found that up to 37% of clinical trials never resulted in a scientific publication, and that the published articles reporting trial outcomes may not be consistent with protocols [7]. Therefore, in order to promote the transparency of clinical trials, several policy recommendations and regulations have been created for trial registration when a trial is launched [8]. For instance, in the United States, the Food and Drug Administration Modernization Act [9] requires all trials for drugs for serious or life-threatening diseases and conditions be registered in ClinicalTrials.gov [10], a clinical trial registry data bank developed and maintained by the National Library of Medicine (NLM), part of the National Institutes of Health (NIH). As of August 2010, ClinicalTrials.gov, the largest of its kind in the world, contains over 100,000 trials from over 170 countries and is used by approximately 65,000 visitors each day. In ClinicalTrials.gov, each registered record provides information about a trial’s purpose, condition, intervention, detailed description, eligibility (who may participate), location, status, etc. Most of the above information is described in natural language. Figure 2 shows an example of a clinical trial record in ClinicalTrials.gov. Like other research databases [11], ClinicalTrials.gov captures important scientific and clinical investigations in biomedicine. As a result, the knowledge buried in those trial records has been shown to be valuable for researchers, clinicians, and the pharmaceutical industry [12, 13, 14].

Figure 1

Clinical trial registration and publication along the drug development pipeline.

PMC3760158F1.gif

 

Figure 2

Snapshot for a clinical trial record in ClinicalTrials.gov.

PMC3760158F2.gif

 

In this study, we hypothesize that ClinicalTrials.gov is a comparably rich data source to the biomedical literature for PGx clinical outcome related information (i.e., how genes affect drug responses in patients with specific diseases). In this regard, we developed a text-mining approach to systematically recognize PGx relevant relationships between genes, drugs, and diseases from both trial record metadata and descriptions (in free text). It should be noted that many relationships mentioned in clinical trials may still be under investigation (i.e., not yet concluded). Despite this fact, they are not selected without cause or randomly. Rather, they are carefully designed and conducted based on the accumulated knowledge from preliminary studies [15,16]. Therefore, these relationships are reasonable candidates for potential inclusion in the databases of PGx studies. In addition, the speculative relationships themselves are valuable for pharmaco-surveillance and pharmacovigilance studies [17]. Moreover, mining PGx information from clinical trials as opposed to the scientific literature has one major advantage: detecting the important information in a more timely manner. That is, a relationship mentioned in a trial may not appear in the literature until several years later. This is expected because it takes time for a trial to be conducted, concluded, and published. To investigate this time lag issue further, we analyzed 8588 PubMed® citations that were manually linked to 7224 trials [18] in ClinicalTrials.gov and computed the time lags between the trials and their resulting publications. As shown in Figure 3a, we found that the average time difference between a trial’s start date and publication date is approximately 5 years. Once a trial is completed, the majority of them (∼62%) have their results published within 2–3 years (see Figure 3b). For instance, to study how genetic polymorphisms influence the efficacy and side effect profiles of Paroxetine and Escitalopram for major depression treatments, a Phase IV clinical trial entitled ‘Clinical pharmacogenomics of antidepressant response’ was launched and registered in ClinicalTrials.gov in 2006 (see Figure 2; NCT number = NCT00384020). The trial was completed 4 years later in 2010 and shortly thereafter, the trial was published in a journal article entitled ‘Genetic polymorphisms of cytochrome P450 enzymes influence metabolism of the antidepressant escitalopram and treatment response’ [19].

Figure 3

Time lag between clinical trial and publication. (a) Time lag from trial start date to publication date. On average, a publication occurs 5 years after its corresponding trial starts. (b) Time lag from trial completion date to publication date (data to ...


Note that in Figure 3a some articles were published in the same year as their trials started (i.e., zero-year difference between the trial start date and publication date). We looked into these cases and found that some of these articles actually describe the study rationale and protocol rather than study results (e.g., article ‘PMID = 19828019’ and its corresponding trial ‘NCT0086251’). Also, some trial start dates reported in ClinicalTrials.gov are likely to be errors. For example, an article (PMID = 18761748) published in September 2008 is linked to a trial (NCT00147966) whose registered start date is June 2008, which is likely to be an error.
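The time-lag analysis described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it works at year granularity, the first two rows restate the examples given in the text, and the rows labeled HYPOTHETICAL (as well as all function and field names) are assumptions added only so that the summary has something to aggregate.

```python
from collections import Counter
from statistics import mean

# Each row links one trial to one resulting publication (years only).
# Rows 1-2 restate examples from the text; HYPOTHETICAL rows are invented.
LINKS = [
    {"nct": "NCT00384020", "start": 2006, "completed": 2010, "published": 2010},
    {"nct": "NCT00147966", "start": 2008, "completed": None, "published": 2008},
    {"nct": "HYPOTHETICAL-1", "start": 2001, "completed": 2004, "published": 2007},
    {"nct": "HYPOTHETICAL-2", "start": 2003, "completed": 2006, "published": 2008},
]


def year_lag(earlier, later):
    """Return later - earlier in whole years, or None if either year is missing."""
    if earlier is None or later is None:
        return None
    return later - earlier


def summarize(links, field):
    """Mean and histogram of the lag from `field` to the publication year."""
    lags = [year_lag(link[field], link["published"]) for link in links]
    lags = [lag for lag in lags if lag is not None]
    return {"n": len(lags), "mean": mean(lags), "histogram": dict(Counter(lags))}


def suspicious(links):
    """Trials published no later than the year they started (worth manual review)."""
    flagged = []
    for link in links:
        lag = year_lag(link["start"], link["published"])
        if lag is not None and lag <= 0:
            flagged.append(link["nct"])
    return flagged


start_summary = summarize(LINKS, "start")            # analogue of Figure 3a
completion_summary = summarize(LINKS, "completed")   # analogue of Figure 3b
flagged = suspicious(LINKS)
```

Flagging zero-year lags separately mirrors the manual check described above, where such cases turned out to be protocol papers or probable date errors.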

2. Related work

In this work, we propose a text-mining approach to identify PGx relevant gene–drug–disease relationships from registered trial records. The related work includes manual curation of gene–drug–disease relationships in PharmGKB, text mining techniques for extracting PGx concepts and relationships, and other text-mining applications to clinical trial records.

2.1. Curated gene–drug–disease relationships in PharmGKB

In PharmGKB, the gene–drug–disease relationships are identified based on human curation and further classified into one of the five general PGx categories: clinical outcome (CO), pharmacodynamics and drug responses (PD), pharmacokinetics (PK), molecular and cellular functional assays (FA), and genotype (GN). The data in the clinical outcome category demonstrates that the genetic variability in the context of a drug effect significantly changes medical outcomes. For example, the gene ‘TYMS’, drug ‘methotrexate’, and disease ‘precursor cell lymphoblastic leukemia–lymphoma’ were identified for curation in PharmGKB based on a relevant article ‘Polymorphism of the thymidylate synthase gene and outcome of acute lymphoblastic leukaemia’ [20]. Subsequently, the above gene–drug–disease relationship was classified into the clinical outcome category. As of August 2010, PharmGKB covers 1621 such gene–drug–disease relationships categorized as clinical outcome.

2.2. Text mining techniques for extracting PGx concepts and relationships

Concept identification serves as a prerequisite for many subsequent tasks of biomedical text mining, such as relationship extraction [21]. In PGx studies, the key concepts include gene, gene variant, drug and disease. Text mining tools have been developed for identifying these concepts, such as GAPSCORE for identifying genes from PGx articles [22] and MutationFinder for identifying gene variants [23]. The relationships between identified PGx concepts can be nontyped (e.g., the relationship between gene ‘TYMS’, drug ‘methotrexate’, and disease ‘precursor cell lymphoblastic leukemia–lymphoma’ is discussed in article PMID = 11937185) or specific (e.g., gene ‘TYMS’ variants affect the clinical outcome of ‘precursor cell lymphoblastic leukemia–lymphoma’ patients treated with ‘methotrexate’). Some attempts have been made at PGx relationship extraction. For example, Garten and Altman developed an ontology-based tool, Pharmspresso, for extracting PGx information from full text articles by identifying concepts (such as genes, drugs, polymorphisms, and diseases) and relationships (such as action, association and comparison) [24]. Ahlers et al. developed a rule-based method for extracting specific PGx relationships such as ‘genetic etiology’ and ‘pharmacological effects’ from PubMed abstracts [25]. Theobald et al. computed conditional probabilities of PGx relationships between drugs, diseases, and genes by analyzing their co-occurrences in PubMed abstracts [26]. Coulet et al. developed a method to identify PGx relationships using syntactic rules and to organize these relationships in an ontology that maps diverse sentence structures and vocabularies to common semantics [27].

Research on applying text mining techniques in PGx studies is gaining attention and has achieved significant improvement in recent years. A review of text-mining progress in PGx information extraction can be found in [6]. Recent workshops devoted to this domain were held at the Pacific Symposium on Biocomputing, where the 2010 and 2011 workshop themes were respectively ‘extraction of genotype–phenotype–drug relationships from text: from entity recognition to bioinformatics application’ [28] and ‘mining the pharmacogenomics literature’ [29].

2.3. Other text-mining applications to clinical trial records

Clinical trials provide valuable information about the efficacy/toxicity of medical intervention. Text-mining techniques have been applied to published randomized clinical trial literature for extracting patient demographic information such as trial size and disease/symptom descriptors [13]. To enable semantic representation and search for clinical research eligibility criteria, some text mining studies have focused on extracting information from the narrative descriptions of eligibility criteria in trial records [30,31].

At present, users can search for trials in ClinicalTrials.gov by entering keywords in the search box. The lack of unambiguous names for entities (e.g., intervention, condition, and gene) hampers the retrieval of all relevant trials that meet users’ specifications. For example, more than 60% of the studies in ClinicalTrials.gov about heart attacks do not contain the phrase ‘heart attack’ but use a different term such as myocardial infarction [32]. To address this issue, the embedded search engine of ClinicalTrials.gov expands user queries using synonyms derived from the Unified Medical Language System (UMLS) [33] and ranks the retrieval results based on a probabilistic model [34]. This query expansion feature enables users who search for ‘heart attack’ to retrieve trials that use the term ‘myocardial infarction’ in the condition description. However, within the trial records themselves, ‘myocardial infarction’ and ‘heart attack’ are still not identified as the same disease concept. This makes ClinicalTrials.gov difficult to link to other related resources (e.g., PharmGKB). An attempt to use a standardized nomenclature for representing various clinical research eligibility entities was reported by Luo et al. [35,36].

3. Methods

The goal of this study is to systematically identify clinical PGx information from clinical trial records. Figure 4 shows an overview of our workflow. We collected 93,661 clinical trial records from ClinicalTrials.gov as of August 2010. We first preprocessed these records and identified sections of interest. Second, we used a dictionary-based method to identify PGx concepts (i.e., diseases, drugs and genes) from the preprocessed trial records. Our gene–drug–disease relationship extraction is based on their co-occurrence in one trial record. Finally, we indexed the trial records with the identified PGx concepts. Hence, given a target PGx gene, our approach can return related diseases and drugs with corresponding trials. Similarly, given a target pair of PGx gene and drug, our approach can return trials in which the PGx pair is or was under investigation.

Figure 4

Workflow for mining clinical trial records.

3.1. Preprocessing clinical trial records

In ClinicalTrials.gov, each trial record is divided into sections, and each section is written in free text (see Figure 2). The condition section includes information on the disease, disorder, syndrome, illness, or injury being studied in a trial. The intervention section includes information on the drug, vaccine, procedure, device, or other potential treatment being investigated in a trial. The study description section describes the study hypothesis and design, together with other information about the trial intended for the lay public. These three sections were identified as important for this work and extracted for further PGx concept identification. Specifically, the condition section was used for disease identification, the intervention section for drug identification, and the study description section for gene identification.
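As a rough sketch of this preprocessing step, the snippet below pulls the three sections of interest out of a single record. The XML layout, element names, and sample record are simplified assumptions made for illustration; they are not taken from the study and should not be read as the exact ClinicalTrials.gov export schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified trial record (element names are assumptions).
SAMPLE_RECORD = """
<clinical_study>
  <nct_id>NCT-EXAMPLE</nct_id>
  <condition>Lung Cancer</condition>
  <intervention><intervention_name>Irinotecan</intervention_name></intervention>
  <intervention><intervention_name>Cisplatin</intervention_name></intervention>
  <brief_summary><textblock>
    This study examines the association between UGT1A1 polymorphisms
    and irinotecan-related toxicity.
  </textblock></brief_summary>
</clinical_study>
"""


def normalize_whitespace(text):
    """Collapse runs of whitespace so free text becomes a single line."""
    return " ".join((text or "").split())


def extract_sections(xml_text):
    """Return the condition, intervention, and description sections of a record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "nct_id": normalize_whitespace(root.findtext("nct_id", default="")),
        # Condition section: used for disease identification.
        "condition": [normalize_whitespace(e.text) for e in root.iter("condition")],
        # Intervention section: used for drug identification.
        "intervention": [
            normalize_whitespace(e.text) for e in root.iter("intervention_name")
        ],
        # Study description: used for gene identification.
        "description": " ".join(
            normalize_whitespace(e.text) for e in root.iter("textblock")
        ),
    }


sections = extract_sections(SAMPLE_RECORD)
```

Keeping the three sections separate makes it possible to restrict each dictionary to the section it is meant for, as described above.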

3.2. Extracting gene–drug–disease relationships

We used a dictionary-based method to identify genes, drugs, and diseases from the preprocessed trial records. Three PharmGKB dictionaries were collected in August 2010, containing 3197 diseases, 2984 drugs, and 26,216 genes, respectively. Each concept, together with its synonyms in the dictionary, is assigned an internal PharmGKB identifier. For example, the drug concept ‘imatinib’ in PharmGKB, together with its synonyms ‘Gleevec’, ‘Glivec’, ‘Imatinib Mesylate’, and ‘Imatinib Methansulfonate’, is assigned PharmGKB_Id = ‘PA10804’. Both names and synonyms were used for identifying PGx entities in trial records. The PGx concept identification and normalization facilitate PGx information retrieval from ClinicalTrials.gov. Furthermore, they make the linking analysis between ClinicalTrials.gov and PharmGKB feasible.
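A dictionary lookup of this kind might look like the sketch below. Only the imatinib identifier and synonyms come from the text; the matching strategy (case-insensitive, whole-word, longest term first) and all function names are assumptions for illustration, since the matching rules actually used are not described here.

```python
import re

# Toy drug dictionary: identifier -> name and synonyms (from the example above).
DRUG_DICT = {
    "PA10804": [
        "imatinib",
        "Gleevec",
        "Glivec",
        "Imatinib Mesylate",
        "Imatinib Methansulfonate",
    ],
}


def build_matcher(dictionary):
    """Compile one regex over all names and map each lowercased name to its id."""
    term_to_id = {}
    for concept_id, names in dictionary.items():
        for name in names:
            term_to_id[name.lower()] = concept_id
    # Longest terms first so 'Imatinib Mesylate' is preferred over 'imatinib'.
    terms = sorted(term_to_id, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern, term_to_id


def find_concepts(text, matcher):
    """Return the set of normalized identifiers mentioned in the text."""
    pattern, term_to_id = matcher
    return {term_to_id[match.group(0).lower()] for match in pattern.finditer(text)}


drug_matcher = build_matcher(DRUG_DICT)
found = find_concepts("Patients received Gleevec (imatinib mesylate) daily.", drug_matcher)
```

Because every synonym maps to the same identifier, two records that use different names for the same drug are indexed under one concept.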

For gene–drug–disease relationship extraction, we used a co-occurrence-based method. We assume that there is a clinical outcome association between a gene, a drug and a disease (i.e., how the gene affects drug responses in patients with the specific disease/condition) if they co-occur in the same trial record in ClinicalTrials.gov.
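The co-occurrence assumption can be illustrated with a short sketch: every combination of a gene, a drug, and a disease identified in the same record is emitted as one candidate relationship. The record structure and the identifiers below are hypothetical placeholders.

```python
from itertools import product


def cooccurrence_triples(record):
    """All (gene, drug, disease) combinations found in a single trial record."""
    return set(product(record["genes"], record["drugs"], record["diseases"]))


def extract_relationships(records):
    """Map each triple to the set of trials in which it co-occurs."""
    relationships = {}
    for record in records:
        for triple in cooccurrence_triples(record):
            relationships.setdefault(triple, set()).add(record["nct_id"])
    return relationships


# Hypothetical records after concept identification (identifiers are placeholders).
records = [
    {"nct_id": "NCT-EXAMPLE-1", "genes": {"GENE-1"},
     "drugs": {"DRUG-1", "DRUG-2"}, "diseases": {"DISEASE-1"}},
    {"nct_id": "NCT-EXAMPLE-2", "genes": {"GENE-1"},
     "drugs": {"DRUG-1"}, "diseases": {"DISEASE-1"}},
    # No gene identified, so this record yields no triple.
    {"nct_id": "NCT-EXAMPLE-3", "genes": set(),
     "drugs": {"DRUG-1"}, "diseases": {"DISEASE-1"}},
]

relationships = extract_relationships(records)
```

A record that mentions several drugs yields one triple per drug, which is one way in which co-occurrence can produce relationships that the trial does not actually investigate.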

3.3. Indexing clinical trials

We systematically compiled the extracted PGx concepts and relationships with their identifiers linking to the corresponding trial records (see Figure 4). To facilitate the retrieval of PGx information from clinical trials, we built an index for the PGx concepts and trial records such that given a PGx gene, our approach first looks up the gene dictionary for its identifier, and then readily retrieves all the trials containing that gene identifier. Similarly, given a PGx gene–drug pair, our approach first looks up the gene and drug dictionary respectively for their identifiers, and then retrieves all the trials in which both identifiers are present.
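One way to realize such an index is an inverted index from concept identifier to trial identifiers, sketched below. The class, the dictionaries, and all identifiers other than PA10804 are hypothetical and serve only to illustrate the two lookups described above.

```python
from collections import defaultdict

# Hypothetical name-to-identifier dictionaries (only PA10804 appears in the text).
GENE_IDS = {"gene-a": "GENE-ID-1"}
DRUG_IDS = {"imatinib": "PA10804", "gleevec": "PA10804"}


class TrialIndex:
    """Inverted index: concept identifier -> set of trial identifiers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, nct_id, concept_ids):
        for concept_id in concept_ids:
            self._postings[concept_id].add(nct_id)

    def trials_for(self, *concept_ids):
        """Trials in which all of the given identifiers are present."""
        postings = [self._postings.get(cid, set()) for cid in concept_ids]
        return set.intersection(*postings) if postings else set()


def lookup(index, gene=None, drug=None):
    """Resolve names to identifiers first, then query the index."""
    ids = []
    if gene is not None:
        ids.append(GENE_IDS[gene.lower()])
    if drug is not None:
        ids.append(DRUG_IDS[drug.lower()])
    return index.trials_for(*ids)


index = TrialIndex()
index.add("NCT-EXAMPLE-1", {"GENE-ID-1", "PA10804"})
index.add("NCT-EXAMPLE-2", {"GENE-ID-1"})
index.add("NCT-EXAMPLE-3", {"PA10804"})

gene_trials = lookup(index, gene="GENE-A")
pair_trials = lookup(index, gene="GENE-A", drug="Gleevec")
```

Querying by identifier rather than by surface name means that a search for a synonym returns the same trials as a search for the preferred name.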

3.4. Hypothesis testing and method evaluation

To test our hypothesis that ClinicalTrials.gov is a comparable source rich in PGx related information, we first compared our results (i.e., extracted PGx relationships between genes, drugs and diseases) in trial records against the ones found in PharmGKB and in PubMed, respectively.

Second, to assess the performance of our text-mining approach, we manually reviewed a subset of automatically extracted relationships. Specifically, two human annotators (JL and ZL) were asked to manually assign one of the following categories to 100 extracted gene–drug–disease relationships: the relationship was not mentioned in the trial record (Category I); the relationship was explicitly mentioned in the trial record with or without supporting publications (Categories II and III). When computing accuracy for our method, both Categories II and III were considered as correct extractions.

4. Results

4.1. Comparative evaluation of ClinicalTrials.gov

For contrasting ClinicalTrials.gov with PharmGKB and PubMed, we compared their coverage of 3-way gene–drug–disease PGx relationships, which were obtained based on the input of 26 PGx gene–drug pairs [37] from the PharmGKB website.

Given these 26 PGx gene–drug pairs, our approach was able to identify 348 clinical trials and 240 3-way PGx relationships. By querying the given PGx pairs in PubMed [38] while limiting the publication type to be ‘Clinical Trial’ and MeSH [39] (Medical Subject Headings) terms to be ‘Genetic Variation’ or ‘Genotype’, 1162 3-way relationships were retrieved in 448 PubMed citations. Finally, we found 261 such 3-way relationships curated in PharmGKB.

Figure 5 shows a detailed comparison of the 3-way gene–drug–disease relationships found in the clinical trials (blue circle; see Footnotes), PubMed abstracts (green circle) and PharmGKB (red circle). Of the relationships found in ClinicalTrials.gov, 124 (51.7%) and 68 (28.3%) were also present in PubMed and PharmGKB, respectively. Of the 51 drug–gene–disease relationships found in all three sources, approximately 75% occurred earlier in trials than in PharmGKB or PubMed.

Figure 5
Comparison of gene–drug–disease relationships identified from different sources. A total of 240 relationships were found in ClinicalTrials.gov. 124 and 68 such relationships were found to be overlapping with 1162 results in PubMed and ...
 

Our approach also identified 99 gene–drug–disease relationships which are currently missing in both PharmGKB and PubMed. Further analysis shows that the majority (65%) of them were found in ongoing trials (e.g., recruiting, or active but not recruiting). For example, the ‘UGT1A1’–‘irinotecan’–‘Gastrointestinal Cancer’ relationship was extracted from a Phase I trial (NCT00654160) which was launched in 2008 and is expected to be completed in 2015. In this trial, researchers proposed to study UGT1A1 genotype-based dosing of irinotecan when given together with fluorouracil and leucovorin in treating patients with advanced gastrointestinal cancer. As of August 2010, this ‘UGT1A1’–‘irinotecan’–‘Gastrointestinal Cancer’ relationship is present in neither PharmGKB nor PubMed.
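The counts reported in this section are consistent with one another, as a short inclusion–exclusion check shows: of the 240 trial-derived relationships, 124 are also in PubMed, 68 are also in PharmGKB, and 51 are in both, which leaves 99 found in neither. The function name below is ours and is used only for illustration.

```python
def overlap_summary(total, in_pubmed, in_pharmgkb, in_both):
    """Inclusion-exclusion over the relationships found in ClinicalTrials.gov."""
    in_either = in_pubmed + in_pharmgkb - in_both
    return {
        "pct_in_pubmed": round(100 * in_pubmed / total, 1),
        "pct_in_pharmgkb": round(100 * in_pharmgkb / total, 1),
        "in_either": in_either,
        "in_neither": total - in_either,
    }


# Figures reported in Section 4.1.
summary = overlap_summary(total=240, in_pubmed=124, in_pharmgkb=68, in_both=51)
```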

4.2. Assessment of our automatic approach

Of the 240 gene–drug–disease relationships identified by our method, 100 were randomly selected for manual review and classification. Of these, 74 were judged to be correct extractions: 30 in Category II and 44 in Category III (see category details in Section 3.4). Hence, our text-mining approach achieves an accuracy of 74%.
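The accuracy figure follows directly from the category counts, as the sketch below shows. The Wilson score interval is an addition of ours that is not reported in the study; it is included only to indicate how much uncertainty a sample of 100 carries, and the count for Category I is inferred as 100 − 74.

```python
from math import sqrt


def accuracy(category_counts, correct=("II", "III")):
    """Fraction of reviewed extractions that fall in the 'correct' categories."""
    total = sum(category_counts.values())
    return sum(category_counts[c] for c in correct) / total


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Category I is inferred (100 reviewed - 74 correct); II and III are as reported.
counts = {"I": 26, "II": 30, "III": 44}
acc = accuracy(counts)
low, high = wilson_interval(counts["II"] + counts["III"], sum(counts.values()))
```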

Table 1 shows 10 examples of correctly identified relationships, as well as their supporting statements and corresponding publications (when available) in the trials. In our evaluation, the first seven relationships were classified as Category II and the other three as Category III. Take the ‘UGT1A1’–‘irinotecan’–‘Lung Cancer’ relationship for example. Our method extracted this relationship from a Phase III clinical trial (NCT00045162) which proposed to determine the association between UGT1A1 polymorphisms and irinotecan-associated toxic effects in patients with lung cancer. Seven years later, in 2009, the pharmacogenomics results of this trial were published (PMID = 19349543), reporting that UGT1A1 (G-3156A)A/A (drug metabolism) was associated with IP (irinotecan plus cisplatin) related neutropenia. As of August 2010, these 10 relationships were missing in PharmGKB. Thus, we believe the relationships identified by our approach are valuable for inclusion in related PGx knowledge bases.

Table 1
Examples of correctly identified gene–drug–disease relationships currently absent in PharmGKB.
 

5. Discussion

Our approach shows that ClinicalTrials.gov is rich in revealing gene–drug–disease relationships for PGx studies. Absent from the current PGx knowledge base (PharmGKB), many of the identified PGx relationships are associated with potential clinical outcomes. In this section, we will discuss the issues of coverage and time lag, the practical implications of this research, and the limitations of our approach, as well as future work.

5.1. Coverage

Although we observed a statistically significant overlap between our results and curated facts in PharmGKB (hypergeometric p-value <0.05), some curated PGx relationships were not detected from clinical trials. This is mainly due to the incompleteness of trial registration, especially for trials conducted outside of the United States. For example, the relationship between gene ‘CYP2C9’, drug ‘tamoxifen’ and disease ‘breast cancer’ was studied in a clinical trial in Turkey which was not registered in ClinicalTrials.gov. The study results, however, were already published in an article entitled ‘Tamoxifen inhibits cytochrome P450 2C9 activity in breast cancer patients’ (PMID = 17024799). As a result, the clinical outcome investigation on the ‘CYP2C9’–‘tamoxifen’–‘breast cancer’ relationship was curated based on the publication but not found in ClinicalTrials.gov by our approach. Encouragingly, ClinicalTrials.gov is now making efforts to collaborate with other countries in creating a universal registration system [32]. This endeavor would promote the accessibility of clinical trials in all countries in the future.
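For readers unfamiliar with the hypergeometric test mentioned above, the following sketch shows how an upper-tail p-value for an overlap can be computed. The size of the background population used in the study is not stated here, so the example call uses deliberately small toy numbers; the function name and parameters are ours.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when `draws` items are sampled without replacement from a
    population containing `successes` items of interest."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy numbers only: 10 possible relationships, 5 of them curated,
# 5 extracted, and all 5 extracted ones are curated.
p_value = hypergeom_upper_tail(k=5, population=10, successes=5, draws=5)
```

In the setting of this section, `draws` would correspond to the relationships extracted from trials, `successes` to those curated in PharmGKB, and `k` to the observed overlap; the choice of `population` is the assumption that most affects the result.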

5.2. Time lag

As mentioned in Section 4.1, approximately 75% of the 3-way drug–gene–disease PGx relationships were identified earlier in trials than in publications. For the remaining 25% of the relationships, we found two main reasons why they appeared earlier in publications than in trials. First, the earliest trial of an identified relationship may not be registered in ClinicalTrials.gov. For example, the relationship between gene ‘SULT1A1’, drug ‘tamoxifen’, and disease ‘breast cancer’ was curated in PharmGKB based on a supporting article (PMID = 15024382) published in 2004, but its corresponding trial is missing in ClinicalTrials.gov. On the other hand, a different trial reporting the same relationship was registered in ClinicalTrials.gov in 2008 (NCT00667121). Hence, the first appearance of the ‘SULT1A1’–‘tamoxifen’–‘breast cancer’ relationship in trial records was dated as 2008 by our method, 4 years behind the earliest publication year.

Second, there is a discrepancy between the nature of clinical trials and the curation scope of PharmGKB. A PGx related clinical trial is designed to study the direct relationships between genes, drugs and diseases (i.e., how genes affect drug responses in patients with specific diseases/conditions). However, both direct and indirect relationships are captured by PharmGKB [40]. For example, in PharmGKB, the clinical outcome annotation for the relationship between gene ‘CYP3A4’, drug ‘pantoprazole’, and disease ‘Gastroesophageal Reflux Disease (GERD)’ is curated based on an article (PMID = 16961157) published in 2006. However, in ClinicalTrials.gov the earliest trial for investigating this relationship was not registered until 2009 (NCT00744419). Therefore, in this case the ‘CYP3A4’–‘pantoprazole’–‘GERD’ relationship was detected 3 years earlier in the literature than in ClinicalTrials.gov. However, our further examination shows that the curated article (PMID = 16961157) is a review rather than an original research report. In that review article, several genes (CYP2C19 and CYP3A4), drugs (amoxicillin, esomeprazole, pantoprazole, etc.), and diseases (Gastroesophageal Reflux and Peptic Ulcer) were discussed, but the exact relationship between ‘CYP3A4’, ‘pantoprazole’, and ‘GERD’ was not reported.

Note that Figure 3a shows that, for each trial-related article, the publication date never precedes the corresponding trial start date. With respect to PGx relationships, however, owing to the aforementioned reasons, some may be found earlier in publications than in trials.

5.3. Practical implications of this research

As mentioned earlier, anyone using extracted relationships from this research should be cautioned that some of those relationships are still under investigation and thus not concluded. Nonetheless, we believe these speculative relationships are still valuable for inclusion in relevant knowledge bases (perhaps with special remarks). Below, we use PharmGKB as a representative PGx knowledge base and show two potential practical uses of our research findings:

First, we recommend building cross-links between PharmGKB and ClinicalTrials.gov. Doing so would allow PharmGKB users to readily identify clinical trials in which relevant PGx genes are under investigation for different conditions and interventions. On the other hand, through linking to PharmGKB, ClinicalTrials.gov users can be exposed to the most comprehensive knowledge of PGx concepts such as gene variants and genetic tests.

Second, relationships found in ClinicalTrials.gov but currently missing in PharmGKB may be considered for future curation. In this regard, we have two specific recommendations for prioritizing the list of candidate relationships: (a) based on our analysis, any extracted relationships that are associated with multiple supporting trials should be of high priority; and (b) any relationships that are associated with completed and published clinical trials should be of high priority. For example, the relationship between gene ‘EGFR’, drug ‘gefitinib’, and disease ‘Head and Neck Cancer’ is associated with four clinical trials (i.e., NCT00083057, NCT00088907, NCT00820417 and NCT00169221). Moreover, the study status of one trial (NCT00083057) is indicated as ‘completed’ and its results are published. Thus, the ‘EGFR’–‘gefitinib’–‘Head and Neck Cancer’ relationship should be of high priority for curation consideration.
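The two prioritization criteria could be combined into a simple sort key, as sketched below. The relative ordering of the criteria (completed and published first, then number of supporting trials) is our assumption, since no ordering is specified above. The EGFR entry restates the example from the text, with the status of the three other trials left as unknown (`None`); the second candidate is entirely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    drug: str
    disease: str
    # nct_id -> {"completed": bool or None, "published": bool or None}
    trials: dict


def priority_key(candidate):
    """Sort key: (has a completed and published trial, number of supporting trials)."""
    has_completed_published = any(
        bool(status["completed"]) and bool(status["published"])
        for status in candidate.trials.values()
    )
    return (has_completed_published, len(candidate.trials))


unknown = {"completed": None, "published": None}
candidates = [
    # Hypothetical candidate supported by a single trial of unknown status.
    Candidate("GENE-X", "drug-y", "disease-z", {"NCT-EXAMPLE-1": dict(unknown)}),
    # Example from the text: four trials, one completed with published results.
    Candidate("EGFR", "gefitinib", "Head and Neck Cancer", {
        "NCT00083057": {"completed": True, "published": True},
        "NCT00088907": dict(unknown),
        "NCT00820417": dict(unknown),
        "NCT00169221": dict(unknown),
    }),
]

ranked = sorted(candidates, key=priority_key, reverse=True)
```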

5.4. Limitations of our approach and future work

In this study, we used a dictionary-based method for gene, disease, and drug identification so that the results could be associated directly with the PharmGKB vocabulary. Like any other dictionary-based method, our approach favored precision but failed to identify entity variants not covered by the dictionaries used. Also, due to name ambiguity between entity types, we may occasionally have identified false positives in our results. For example, the PGx gene symbol ‘TPMT’ is also the abbreviation of the drug ‘topiramate’. This ambiguity directly led to an error in gene identification from a trial (NCT00884884) in which TPMT is indicated as the short form of the antiepileptic drug ‘topiramate’.

In relationship extraction, we used a co-occurrence-based method for identifying relationships between genes, drugs, and diseases. Although this method has been successfully applied in a number of studies such as [41, 42, 43], it has certain limitations: (a) not all co-occurring relationships are actually meaningful (accounting for 26% of the errors in relationship extraction); and (b) we cannot characterize the types of relationships extracted.

In the future, we plan on (a) improving the methods for PGx concept identification and relationship extraction using more sophisticated NLP techniques such as dependency parsing [44]; (b) designing a method for ranking extraction results by combining features like relevant trial status and numbers; (c) developing a robust method for linking clinical trials to their corresponding publications when they are not manually supplied by the trial investigators; (d) developing an automatic method to detect specific gene variants and allele changes which affect drug response reported in trial results and further link them to a standardized gene variation database such as dbSNP [45].

6. Conclusions

The clinical trial is at a critical juncture in the drug development pipeline, connecting previous studies on molecular mechanism with a final decision of approval. We successfully developed a systematic approach to automatically identify clinical PGx information from registered clinical trials. In this study, we collected 93,661 clinical trial records from ClinicalTrials.gov and used a dictionary-based method to identify and normalize PGx concepts (i.e., diseases, drugs and genes) in the texts of the collected trial records. In relationship extraction, we used a co-occurrence-based method for identifying relationships between genes, drugs, and diseases. To facilitate the retrieval of PGx information from clinical trials, we built an index for the PGx concepts and the trials collected in our study. Hence, given a PGx gene–drug pair, our approach can return trials in which the pair is studied under different conditions and controls. In comparative evaluation, we show that ClinicalTrials.gov is a rich source of PGx gene–drug–disease relationships. Manual review shows that our automatic identification method achieves an accuracy of 74%. By comparing our results with the relationships identified from PubMed abstracts and in PharmGKB, we found that our approach can potentially enrich current resources and accelerate the dissemination of clinical outcome information of pharmacogenomics.

Acknowledgments

This research was supported by the Intramural Research Program of the National Institutes of Health, National Library of Medicine. The authors would like to thank Dr. John Wilbur for his helpful comments on this manuscript and Dr. Bastien Rance for his valuable discussion on the related work. The authors would also like to thank the PharmGKB team for clarifying their curation scope and discussing the usefulness of our work.

Abbreviations

PGx: pharmacogenomics
PharmGKB: pharmacogenomics knowledge base

Footnotes

For interpretation of color in Figure 5, the reader is referred to the web version of this article.

References

1

Evans WE, Relling MV. Pharmacogenomics: translating functional genomics into rational therapeutics. Science. 1999;286(5439):487–491. [PubMed]
 

2

Penny MA, McHale D. Pharmacogenomics and the drug discovery pipeline: when should it be implemented? Am J Pharmacogenomics. 2005;5(1):53–62. [PubMed]
 

3

Klein TE, Chang JT, Cho MK, Easton KL, Fergerson R, Hewett M, et al. Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics research network and knowledge base. Pharmacogenomics J. 2001;1(3):167–170. [PubMed]
 

4

Thorn CF, Klein TE, Altman RB. Pharmacogenomics and bioinformatics: PharmGKB.Pharmacogenomics. 2010;11(4):501–505. [PMC free article] [PubMed]
 

5

Altman RB, Flockhart DA, Sherry ST, Oliver DE, Rubin DL, Klein TE. Indexing pharmacogenetic knowledge on the World Wide Web. Pharmacogenetics. 2003;13(1):3–5. [PubMed]
 

6

Garten Y, Coulet A, Altman RB. Recent progress in automatically extracting information from the pharmacogenomic literature. Pharmacogenomics. 2010;11(10):1467–1489. [PMC free article][PubMed]
 

7

Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA.2004;291(20):2457–2465. [PubMed]
 

8

Zarin DA, Tse T. Medicine. Moving toward transparency of clinical trials. Science.2008;319(5868):1340–1342. [PMC free article] [PubMed]
 

9

The Food and Drug Administration Modernization Act of. :113. Pub. L. No. 105–115.
 

10

ClinicalTrials.gov. < http://ClinicalTrials.gov/>.
 

11

Hurdle JF, Botkin J, Rindflesch TC. Leveraging semantic knowledge in IRB databases to improve translation science. AMIA Annu Symp Proc. 2007:349–353. [PMC free article] [PubMed]
 

12

Rennie D. Trial registration: a great idea switches from ignored to irresistible. JAMA.2004;292(11):1359–1362. [PubMed]
 

13

Xu R, Garten Y, Supekar KS, Das AK, Altman RB, Garber AM. Extracting subject demographic information from abstracts of randomized clinical trial reports. Stud Health Technol Inform.2007;129(Pt 1):550–554. [PubMed]
 

14

Xu R, Supekar K, Huang Y, Das A, Garber A. Combining text classification and Hidden Markov Modeling techniques for categorizing sentences in randomized clinical trial abstracts. AMIA Annu Symp Proc. 2006:824–828. [PMC free article] [PubMed]
 

15

Orloff J, Douglas F, Pinheiro J, Levinson S, Branson M, Chaturvedi P, et al. The future of drug development: advancing clinical trial design. Nat Rev Drug Discov. 2009;8(12):949–957. [PubMed]
 

16

Kramer JA, Sagartz JE, Morris DL. The application of discovery toxicology and pathology towards the design of safer pharmaceutical lead candidates. Nat Rev Drug Discov. 2007;6(8):636–649.[PubMed]
 

17

Introduction to drug utilization research. Oslo, Norway: World Health Organization; p. 8.
 

18

ClinicalTrials.gov identifier to be added to MEDLINE®/PubMed® data. <http://www.nlm.nih.gov/pubs/techbull/mj05/mj05_ct.html>.
 

19

Tsai MH, Lin KM, Hsiao MC, Shen WW, Lu ML, Tang HS, et al. Genetic polymorphisms of cytochrome P450 enzymes influence metabolism of the antidepressant escitalopram and treatment response. Pharmacogenomics. 2010;11(4):537–546. [PubMed]
 

20

Krajinovic M, Costea I, Chiasson S. Polymorphism of the thymidylate synthase gene and outcome of acute lymphoblastic leukaemia. Lancet. 2002;359(9311):1033–1034. [PubMed]
 

21

Jensen LJ, Saric J, Bork P. Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet. 2006;7(2):119–129. [PubMed]
 

22

Chang JT, Schutze H, Altman RB. GAPSCORE: finding gene and protein names one word at a time.Bioinformatics. 2004;20(2):216–225. [PubMed]
 

23

Caporaso JG, Baumgartner WA, Jr, Randolph DA, Cohen KB, Hunter L. MutationFinder: a high-performance system for extracting point mutation mentions from text. Bioinformatics.2007;23(14):1862–1865. [PMC free article] [PubMed]
 

24

Garten Y, Altman RB. Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text. BMC Bioinformatics. 2009;10(Suppl. 2S6) [PMC free article][PubMed]
 

25

Ahlers CB, Fiszman M, Demner-Fushman D, Lang FM, Rindflesch TC. Extracting semantic predications from Medline citations for pharmacogenomics. Pac Symp Biocomput. 2007:209–220.[PubMed]
 

26

Theobald M, Shah N, Shrager J. Extraction of conditional probabilities of the relationships between drugs, diseases, and genes from PubMed guided by relationships in PharmGKB. Summit on Translat Bioinforma. 2009:124–128. [PMC free article] [PubMed]
 

27

Coulet A, Shah NH, Garten Y, Musen M, Altman RB. Using text to build semantic networks for pharmacogenomics. J Biomed Inform. 2010;43(6):1009–1019. [PMC free article] [PubMed]
 

28

Coulet A, Shah N, Hunter L, Barral C, Altman RB. Extraction of genotype–phenotype–drug relationships from text: from entity recognition to bioinformatics application. Pac Symp Biocomput.2010:485–487. [PMC free article] [PubMed]
 

29

Cohen KB, Garten Y, Hahn U, Shah NH. Mining the pharmacogenomics literature – workshop introduction. Pac Symp Biocomput. 2011:362–363.
 

30

Tu SW, Peleg M, Carini S, Bobak M, Ross J, Rubin D, et al. A practical method for transforming free-text eligibility criteria into computable criteria. J Biomed Inform. 2011;44(2):239–250.[PMC free article] [PubMed]
 

31

Weng C, Wu X, Luo Z, Boland MR, Theodoratos D, Johnson SB. EliXR: an approach to eligibility criteria extraction and representation. J Am Med Inform Assoc. 2011 [PMC free article] [PubMed]
 

32

Zarin DA, Ide NC, Tse T, Harlan WR, West JC, Lindberg DA. Issues in the registration of clinical trials. JAMA. 2007;297(19):2112–2120. [PubMed]
 

33

Lindberg DA, Humphreys BL, McCray AT. The unified medical language system. Methods Inform Med. 1993;32(4):281–291. [PubMed]
 

34

Ide NC, Loane RF, Demner-Fushman D. Essie: a concept-based search engine for structured biomedical text. J Am Med Inform Assoc. 2007;14(3):253–263. [PMC free article] [PubMed]
 

35

Luo Z, Duffy R, Johnson S, Weng C. AMIA summits on translational science proceedings. San Francisco, USA: 2010. Corpus-based approach to creating a semantic lexicon for clinical research eligibility criteria from UMLS; pp. 26–30. [PMC free article] [PubMed]
 

36

Luo Z, Johnson SB, Weng C. AMIA annual symposium proceedings. Washington, DC, USA: 2010. Semi-automatically inducing semantic classes of clinical research eligibility criteria using UMLS and hierarchical clustering; pp. 487–491. [PMC free article] [PubMed]
 

39

Medical Subject Headings (MeSH®) < http://www.nlm.nih.gov/mesh/> .
 

40

How are pharmacogenomics articles annotated in PharmGKB? <http://www.pharmgkb.org/resources/faqs.jsp#FAQs-annotatedPKPD> .
 

41

Srinivasan P, Libbus B. Mining MEDLINE for implicit links between dietary substances and diseases. Bioinformatics. 2004;20(Suppl. 1):i290–i296. [PubMed]
 

42

Hristovski D, Peterlin B, Mitchell JA, Humphrey SM. Using literature-based discovery to identify disease candidate genes. Int J Med Inform. 2005;74(2–4):289–298. [PubMed]
 

43

Frijters R, van Vugt M, Smeets R, van Schaik R, de Vlieg J, Alkema W. Literature mining for the discovery of hidden connections between drugs, genes and diseases. PLoS Comput Biol. 2010;6(9). [PMC free article] [PubMed]
 

44

De Marneffe MC, MacCartney B, Manning CD. The international conference on language resources and evaluation. Genoa, Italy: 2006. Generating typed dependency parses from phrase structure parses; pp. 449–454.
 

45

Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. DbSNP: the NCBI database of genetic variation. Nucl Acids Res. 2001;29(1):308–311. [PMC free article] [PubMed]

Euretos

The cost of bringing a new drug or therapy to market has nearly tripled in the past 10 years to approximately 1.8 billion US$. One of the main causes for this steady increase in costs is the high attrition rate of leads occurring very late in the approval process in the most expensive clinical Phase II & III stages. By using BRAIN, especially at the early stages of the R&D cycle, lead quality will improve, lowering late stage lead attrition. With BRAIN you minimise time to knowledge enabling researchers to maximise lead quality and:

  • Deepen the understanding of disease mechanisms
  • Increase the number of high quality leads
  • Decrease wasted investment in lead optimisation infrastructure and capacity
  • Assess more effectively pre-clinical in vitro and in vivo test results
  • Be alerted immediately to pinpointed hypotheses that are essential to R&D success

Download the BRAIN factsheet MY NOTE: See Below

 

 

Fact Sheet

 
Fact Sheet - Bio Relations and Intelligence Network [BRAIN[Ξ]]
 
BRAIN[Ξ]
Data Sources include
Publications
  • Pubmed

Biology

  • Swissprot
  • WikiPathways
  • Enzyme
  • Human Metabolome

Pharmacology

  • ChEBI
  • ChEMBL
  • Chemspider
  • Drugbank

Genetics

  • Gene Ontology
  • OMIM
  • GWAS Central
  • LOVD
  • RIKEN FANTOM5
  • Genbank
  • Homologene
  • KEGG
  • Mouse Genome Database

Patents

  • USPTO

People [scientists]

  • Wiki People

Other

  • UMLS
  • Toxicogenomics
Euretos is a start-up that was founded in 2012. The company is privately held. The founders of Euretos have extensive backgrounds in bio-informatics & semantics, IT integration and IT support, network solutions, big data solutions and decision support applications. Euretos works closely with a number of leading academic partners including Leiden University Medical Center, the Dutch Techcentre for Life Sciences, the Netherlands eScience Center and the Erasmus University Medical Center.
 

Late stage lead attrition in biotech & pharma R&D

The cost of bringing a new drug or therapy to market has nearly tripled in the past 10 years to approximately 1.8 billion US$ [1]. This is causing severe pressure on the long term sustainability of biomedical R&D. One of the main causes for this steady increase in costs is the high attrition rate of leads occurring very late in the approval process, especially in the most expensive clinical Phase II & III stages. Increasing ‘intended’ attrition in the early stages of biomedical R&D (i.e. identifying low quality leads as soon as possible) and as a result bringing down late stage attrition is one of the most important priorities in biotech & pharma R&D today.
 
BRAINFactSheetFigure1.png

BRAIN[Ξ] – addressing lead attrition

BRAIN addresses the issue of late stage lead attrition by providing unprecedented levels of access to life sciences knowledge to researchers throughout the entire R&D and approval lifecycle. One of the key elements is the abundance of publicly available information. It is hard to imagine answering today's research questions without consulting key public resources such as PubMed, Swissprot, Drugbank or the US Patent Office Database. Getting useful results from these data sources, however, is very challenging: there is too much relevant data, and information is stored in too many disconnected data sources. Manually searching all these sources is time consuming; the results are at best incomplete, and making the connections between the results found is simply impossible. This situation is one of the biggest frustrations of life scientists today.

BRAIN[Ξ] – The Euretos ‘Bio Relations and Intelligence Network’

The Euretos’ Bio Relations and Intelligence Network [BRAIN[Ξ]] brings together an unprecedented range of scientifically high value data sources [see insert on the left]. BRAIN[Ξ] reads these sources, recognises the terms mentioned and stores the statements involving these terms as a relation between the terms. BRAIN[Ξ] thus provides one single, ‘bio relations’ view on all underlying data sources. Updates of sources and new sources are continuously added to BRAIN[Ξ] allowing it to provide immediate alerts. With BRAIN[Ξ] you are able to:
  • Deepen the understanding of disease mechanisms
  • Increase the number of high quality leads
  • Decrease wasted investment in lead optimisation infrastructure and capacity
  • Assess more effectively pre-clinical in vitro and in vivo test results
  • Be alerted immediately to pinpointed hypotheses that are essential to R&D success
 
1 (Nature Reviews Drug Discovery, March 2010, on R&D for pharmaceutical and small biotechnology companies)
 
BRAINFactSheetFigure2.png

Get answers ‘now’!

With BRAIN[Ξ], key research questions can be answered that till now took either days of effort or were simply too complex to undertake. Imagine getting answers to the following questions with just a few clicks:
  • “Rank a given set of potential leads on their ADME-T properties”
  • “Which cell surface proteins have high expression in kidney tissue and low ones in others?”
  • “Which biomarkers of hepatocellular carcinoma are part of the cirrhosis pathway?”
  • “How does LTB4 play a role in pulmonary hypertension?”
  • “List me all the oxidoreductase inhibitors active at <100 nM in both human and mouse”
  • “List all co-factors for the enzymes shorter than 300 AA in Hexose Transport”
  • “Which SNPs are associated with colour in my crop species?”
  • “Is there an explanation in literature for an association I found in a high throughput data set?”
  • “For a specific receptor give me all known agonists”
  • “For a given protein-protein interaction give me all known inhibitors”

Unique value

BRAIN[Ξ] is simply unique and provides value that is not available elsewhere:
 
  • It contains over 99% of biomedical and biochemical life sciences concepts by leveraging the most important and widely used life sciences ontologies.
  • It brings together an unprecedented scope of public scientific data sources ranging from publications to gene databases, protein data sets, pathway spaces and patent information.
  • These data sources are updated and new sources are added continuously by Euretos.
  • All data is read, recognised and stored as relations enabling all data to be linked.
  • All redundancy is removed from the data sources by making sure each term or relation is fully recognised and therefore stored only once.
  • All relations are ‘scientifically valued’; for each statement a scientific value is created based on the number of reputable sources or scientists that have contributed to the statement.
  • You can always drill down to the underlying publications or data sources to explore at the source level.
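
The "stored only once" and "scientifically valued" properties listed above can be illustrated with a minimal Python sketch. This is not Euretos code: the class name RelationStore, its methods and the source labels are hypothetical, term recognition is reduced to simple case and whitespace normalisation, and the "scientific value" is approximated as the number of distinct sources asserting a relation.

```python
from collections import defaultdict


def normalise(term):
    """Case-fold and collapse whitespace so spelling variants share one key."""
    return " ".join(term.lower().split())


class RelationStore:
    """Toy store that keeps each (subject, predicate, object) relation once."""

    def __init__(self):
        # Maps a normalised relation to the set of sources that assert it.
        self._sources = defaultdict(set)

    def _key(self, subject, predicate, obj):
        return (normalise(subject), normalise(predicate), normalise(obj))

    def add(self, subject, predicate, obj, source):
        # Re-adding a known relation only extends its set of sources.
        self._sources[self._key(subject, predicate, obj)].add(source)

    def score(self, subject, predicate, obj):
        # Assumed scoring rule: count of distinct supporting sources.
        return len(self._sources.get(self._key(subject, predicate, obj), ()))

    def sources(self, subject, predicate, obj):
        # Drill down from a relation to the sources behind it.
        return sorted(self._sources.get(self._key(subject, predicate, obj), ()))

    def __len__(self):
        return len(self._sources)


# Hypothetical usage: two sources assert the same relation in different spellings.
store = RelationStore()
store.add("LTB4", "associated with", "Pulmonary Hypertension", "source-A")
store.add("ltb4", "associated  with", "pulmonary hypertension", "source-B")
```

Keying on the normalised triple is what removes redundancy in this sketch; a production system would presumably map terms to ontology identifiers instead of relying on string normalisation.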

Benefits

By using BRAIN[Ξ], especially at the early stages of the R&D cycle, lead quality will improve, lowering late stage lead attrition. This will have a direct impact on the cost levels and lead time of the biomedical R&D and approval process.

Contact

If you would like to know more about how BRAIN[Ξ] can improve the return on your R&D investments visit our website http://www.euretos.com or contact us at: Information@euretos.com

Appendix 1 – Selection of relevant publications

1: Nanopublication in the e-science era: Mons, B. & Velterop, J. Workshop on Semantic Web Applications in Scientific Discourse (SWASD 2009) (Washington, DC, USA, 2009).
 
2: The anatomy of a nanopublication: Paul Groth, Andrew Gibson, Jan Velterop
Information Services and Use, Vol. 30, No. 1. (1 January 2010), pp. 51-56, doi:10.3233/ISU-2010-0613
 
3: In silico knowledge and content tracking. Van Haagen H, Mons B.
Methods Mol Biol. 2011;760:129-40.
 
4: The value of data: Mons B, van Haagen H, Chichester C, Hoen PB, den Dunnen JT, van Ommen G, van Mulligen E, Singh B, Hooft R, Roos M, Hammond J, Kiesel B, Giardine B, Velterop J, Groth P, Schultes E.
Nat Genet. 2011 Mar 29;43(4):281-3.
 
5: In silico discovery and experimental validation of new protein-protein interactions: van Haagen HH, 't Hoen PA, de Morrée A, van Roon-Mom WM, Peters DJ, Roos M, Mons B, van Ommen GJ, Schuemie MJ.
Proteomics. 2011 Mar;11(5):843-53. doi: 10.1002/pmic.201000398. Epub 2011 Jan 31.
 
6: Novel Protein-Protein Interactions Inferred from Literature Context: van Haagen HH, 't Hoen PA, Botelho Bovo A, de Morrée A, van Mulligen EM, Chichester C, Kors JA, den Dunnen JT, van Ommen GJ, van der Maarel SM, Kern VM, Mons B, Schuemie MJ
PLoS ONE 2009, November 18
 
7: Placing landMarks in the Knowledge Space: crowd-sourcing landmark publications for benchmarking text-mined predictions: Thompson M, van Haagen H, Mons B, Schultes E.
In proceedings of the Semantic Web Applications and Tools for the Life Sciences (SWAT4LS 2012), Paris, France, Nov 2012.
 
8: Theoretical and technological building blocks for an innovation accelerator: F van Harmelen, G Kampis, K Börner, P van den Besselaar, E Schultes, C Goble ...
The European Physical Journal Special Topics 214 (1), 183-214
 
9: Microattribution and nanopublication as means to incentivize the placement of human genome variation data into the public domain: GP Patrinos, DN Cooper, E van Mulligen, V Gkantouna, G Tzimas, Z Tatum, E ...
Human Mutation 3, 2012
 
10: Solving bottlenecks in data sharing in the life sciences: R Dalgleish, E Molero, R Kidd, M Jansen, D Past, A Robl, B Mons, C Diaz, A ...
Human Mutation
 
11: Open PHACTS: Semantic interoperability for drug discovery: AJ Williams, L Harland, P Groth, S Pettifer, C Chichester, EL Willighagen ...
Drug Discovery Today. To appear 17, 2012
 
12: Speeding up research with the Semantic Web: M Roos, EA Schultes, B Mons
Orphanet Journal of Rare Diseases 7 (Suppl 2), A11
 

BRAIN Sample Output

PDF

 

BRAINSampleOutput.png

BRAIN

Source: http://www.euretos.com/brain

Maximise lead quality 

The Euretos’ Bio Relations and Intelligence Network [BRAIN] brings together an unprecedented range of scientifically high value data sources essential for assessing biomedical lead quality. BRAIN reads these sources, recognises the terms mentioned and stores the statements involving these terms as a relation between the terms. BRAIN thus provides one single, ‘bio relations’ view on all underlying data sources.  Updates of sources and new sources are continuously added to BRAIN allowing it to provide immediate alerts. 

With BRAIN, key research questions can be answered that till now took either days of effort or were simply too complex to undertake. Imagine getting answers to the following questions with just a few clicks:

  • “Rank a given set of potential leads on their ADME-T properties”
  • “Which cell surface proteins have high expression in kidney tissue and low ones in others?”
  • “Which biomarkers of hepatocellular carcinoma are part of the cirrhosis pathway?”
  • “How does LTB4 play a role in pulmonary hypertension?”
  • “List me all the oxidoreductase inhibitors active at <100 nM in both human and mouse”
  • “List all co-factors for the enzymes shorter than 300 AA in Hexose Transport”
  • “Which SNPs are associated with colour in my crop species?”
  • “Is there an explanation in literature for an association I found in a high throughput data set?”
  • “For a specific receptor give me all known agonists”
  • “For a given protein-protein interaction give me all known inhibitors”

Minimise time to knowledge 

BRAIN is simply unique and provides value that is not available elsewhere: 

  • It contains over 99% of biomedical and biochemical life sciences concepts by leveraging the most important and widely used life sciences ontologies.
  • It brings together an unprecedented scope of public scientific data sources ranging from publications to gene databases, protein data sets, pathway spaces and patent information.
  • These data sources are updated and new sources are added continuously by Euretos.
  • All data is read, recognised and stored as relations enabling all data to be linked.
  • All redundancy is removed from the data sources by making sure each term or relation is fully recognised and therefore stored only once.
  • All relations are ‘scientifically valued’; for each statement a scientific value is created based on the number of reputable sources or scientists that have contributed to the statement.
  • You can always drill down to the underlying publications or data sources to explore at the source level.

Euretos

Source: http://www.euretos.com/euretos

Euretos is a start-up that was founded in 2012. The company is privately held. The founders of Euretos have extensive backgrounds in bio-informatics & semantics, IT integration and IT support, network solutions, big data solutions and decision support systems. Euretos works closely with a number of leading academic partners including Leiden University Medical Center, the Dutch Techcentre for Life Sciences, the Netherlands eScience Center and the Erasmus University Medical Center.

The leadership team of Euretos:

 

Albert Mons - CEO

Albert has a background of over 10 years in the bioinformatics & semantics domain, where he founded a number of ventures and took on sales and CEO roles.

   

 

Marco Wanders - Head of Sales 

Marco has an extensive background in sales & marketing in companies such as Microsoft and Redback where he has been involved in major sales deals and a multi-billion dollar IPO.

 

 

Onno Becker Hof - CTO 

Onno has been leading technological R&D for over 18 years in high innovation environments focusing in particular on high performance & availability solutions.

 

Aram Krol - Head of Product Development

Aram has over 15 years of experience in product development of  global high transactional messaging systems and decision support solutions.

Arie Baak - Head of Market Development 

Arie has over 17 years of experience in marketing and proposition development in 'big data' environments with a particular focus on the life sciences since 2011.

Contact

Source: http://www.euretos.com/contact

  Please fill out the form to request a webinar, ask a specific question or have one of our team members contact you.

  (If you are a user of BRAIN and need support please log in to MyBRAIN here)

 

 


Login

Source: http://www.euretos.com/login

 

 
 
 

Biosemantics

Source: http://www.biosemantics.org/new/

Welcome to the website of the Biosemantics Group.

With the explosion of information in the biomolecular field, there is a dire need for tools that assist biologists in retrieving, extracting, and relating information and knowledge in the literature and in molecular databases. The Biosemantics Association develops and evaluates such tools, focussing on the elucidation of hidden or implicit knowledge by the massive meta-analysis of textual documents.
The groups currently address three areas of research:

1. Concept identification and disambiguation algorithms. 
Proper recognition of the concepts that characterize a document, and disambiguation of terms that can have more than one meaning, form the basis for all subsequent steps in text analysis. We make extensive use of thesauri that contain the concepts relevant for a particular field.


2. Meta-analysis and visualization techniques.
For meta-analysis, we are studying different approaches that relate the information in many (possibly hundreds of thousands) documents from the literature. Visualization deals with reducing the often multi-dimensional output of the meta-analysis tools to two dimensions that can easily be interpreted.


3. Evaluation of applications in the biological field.

In several studies we are investigating the potential of the developed technology to interconnect genes and proteins and discover knowledge that is hidden in the literature, while the technology is also being evaluated for semi-automated annotation of protein function. Development and evaluation is done in close cooperation with domain experts, both in national and international collaborations.
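
The first research area above, thesaurus-based concept identification with disambiguation, can be sketched in a few lines of Python. This is only an illustration and not the Biosemantics Group's algorithm: the mini-thesaurus, the concept identifiers and the cue-word overlap rule are all assumptions made for the example.

```python
import re

# Hypothetical mini-thesaurus: surface term -> candidate concepts, each with
# a set of context cue words used for disambiguation.
THESAURUS = {
    "ltb4": [("C_LEUKOTRIENE_B4", {"leukotriene", "inflammation", "neutrophil"})],
    "psa": [
        ("C_PROSTATE_SPECIFIC_ANTIGEN", {"prostate", "serum", "cancer"}),
        ("C_PSORIATIC_ARTHRITIS", {"arthritis", "joint", "psoriasis"}),
    ],
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def identify_concepts(text):
    """Return the concept ids found in text, in order of first mention.

    An ambiguous term is resolved to the candidate whose cue words overlap
    most with the words of the document; ties go to the first candidate.
    """
    tokens = tokenize(text)
    context = set(tokens)
    found = []
    for token in tokens:
        candidates = THESAURUS.get(token)
        if not candidates:
            continue
        concept_id, _cues = max(candidates, key=lambda c: len(c[1] & context))
        if concept_id not in found:
            found.append(concept_id)
    return found
```

Using the whole document as context keeps the sketch short; a windowed context or a statistical model would be natural refinements, and the resulting concept lists are the kind of input a later meta-analysis step would relate across documents.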

Background

The Biosemantics Group is a collaboration between the Medical Informatics department of the ErasmusMC University Medical Center of Rotterdam and the Center for Human and Clinical Genetics of the Leiden University  

 

Immune activation and collateral damage in AIDS pathogenesis

Source: http://www.frontiersin.org September 2013 | Volume 4 | Article 298 | PDF

REVIEW ARTICLE

published: 26 September 2013
doi: 10.3389/fimmu.2013.00298

Frank Miedema1*, Mette D. Hazenberg2, Kiki Tesselaar1, Debbie van Baarle1, Rob J. de Boer3 and José A. M. Borghans1
1 Department of Immunology, University Medical Center Utrecht, Utrecht, Netherlands
2 Department of Internal Medicine and Hematology, Academic Medical Center, Amsterdam, Netherlands
3 Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, Netherlands
 
Edited by:
Teunis Geijtenbeek, University of Amsterdam, Netherlands
 
Reviewed by:
Cynthia Ann Derdeyn, Emory University, USA
William Anderson Paxton, Academic Medical Center, Netherlands
 
*Correspondence:
Frank Miedema, Department of Immunology, University Medical Center Utrecht, Heidelberglaan 100, 3584CX, Utrecht, Netherlands
e-mail: f.miedema@umcutrecht.nl
In the past decade, evidence has accumulated that human immunodeficiency virus (HIV)-induced chronic immune activation drives progression to AIDS. Studies among different monkey species have shown that the difference between pathological and non-pathological infection is determined by the response of the immune system to the virus, rather than its cytopathicity. Here we review the current understanding of the various mechanisms driving chronic immune activation in HIV infection, the cell types involved, its effects on HIV-specific immunity, and how persistent inflammation may cause AIDS and the wide spectrum of non-AIDS related pathology. We argue that therapeutic relief of inflammation may be beneficial to delay HIV-disease progression and to reduce non-AIDS related pathological side effects of HIV-induced chronic immune stimulation. 
 
Keywords: AIDS, pathogenesis, immune activation, TLR, Immunity, therapy
 

CHRONIC IMMUNE ACTIVATION IS THE PRIMARY DRIVER IN HIV PATHOGENESIS

Upon discovery of the virus that causes AIDS, the name human immunodeficiency virus (HIV) was coined because the virus eventually causes severe immune deficiency. This was based on the clinical symptoms with which end-stage HIV-infected patients presented and on the gradual decline of CD4+ T-cell numbers in the blood, which is still considered a hallmark of HIV-disease progression. The finding that HIV is confined to CD4+ leukocytes and is cytopathic for CD4+ T-cells established the hypothesis that HIV causes immune deficiency by directly killing CD4+ T-cells and impeding CD4+ T-cell renewal (1). The molecular mechanisms involved in CD4+ T-cell killing by HIV infection have been studied in great detail, leading to novel insights into the down-stream effects of abortive infection and viral integration on cell death (2, 3, 4). However, increased apoptosis rates in HIV-infected individuals are not confined to infected CD4+ T-cells, but are also observed in non-infected CD4+ T-cells and in cell types that are not even targets for HIV infection, suggesting that the cytopathic effects of HIV are not the full story (5, 6).
 
Paradoxically, HIV induces strong cellular immune responses, both with respect to magnitude and breadth (7, 8, 9, 10, 11), and even in progressive HIV infection, high avidity HIV-specific CD8+ T-cells are being induced (12). Both CD4+ and CD8+ T-cells are more activated in acute and chronic HIV infection, and hence proliferate rapidly and have a short half-life. This explains why both T-cell production and death rates are increased throughout HIV infection (13, 14). At first, the high division rate of CD4+ T-cells in untreated HIV-infected patients was interpreted to reflect a homeostatic response to the loss of CD4+ T-cells (15, 16, 17, 18). Studies in patients on combination anti-retroviral therapy (cART) pointed out, however, that T-cell proliferation rates drop concomitant with the loss of virus, even when CD4+ T-cell numbers are still far below healthy control levels, suggesting that the increased T-cell division rates are caused by the virus itself. It became clear that chronic immune activation is a hallmark of pathogenic HIV infection, exemplified by the increased expression of soluble and cellular immune activation markers, including IFNα, TNFα, and sTNFR and the increased fraction of activated CD8+ T-cells; markers that have long been used as surrogate markers for HIV-disease progression (19, 20, 21, 22, 23, 24, 25, 26, 27). In fact, the level of immune activation is the best predictor of progression to AIDS (28, 29) and death (22, 30, 31, 32), independent of HIV viral load. HIV-2 infection is characterized by an overall slower progression rate, lower viral loads, and higher CD4+ T-cell numbers than HIV-1 infection (33). Yet, the cytopathicity of HIV-2 for human CD4+ lymphoid cells is not lower compared to HIV-1 (34). A striking difference between the two viral subtypes is that the level of immune activation is lower in HIV-2 compared to HIV-1 infection, although expression patterns and prognostic values for immune activation markers were found to be similar when patients with HIV-1 or HIV-2 infection were matched for CD4+ T-cell depletion levels (35, 36). These observations were paralleled by insights from simian immunodeficiency virus (SIV) infection in sooty mangabeys (SMs) and African green monkeys (AGMs). SIV infection in these animals is characterized by high viral loads without high levels of immune activation, and does not lead to AIDS, which will be discussed in detail in Box 1 below (37, 38). Together, these observations have gradually shifted the paradigm from the classical hypothesis that viral cytopathicity is the primary driver of CD4+ T-cell depletion and immune deficiency, to the hypothesis that chronic immune activation is the cause of T-cell depletion and immune deficiency (35, 39).

Box 1 Damage control in non-pathogenic SIV infection

In pathogenic SIV infection in rhesus macaques (RMs), high levels of immune activation are associated with progression to AIDS. SIV infection in sooty mangabeys (SMs) and African green monkeys (AGMs), in contrast, does not lead to AIDS despite high viral loads (37, 87, 88, 89, 90, 91). Interestingly, SM do not mount stronger cytotoxic T-cell or neutralizing antibody responses to SIV compared to RM, and productively infected CD4+ T-cells in SIV-infected SM and RM have similar life spans (92–94). Several lines of evidence show that systemic LPS induces features of pathogenic SIV infection (95), and that pre-existing microbial translocation and loss of GI integrity in pigtail macaques were associated with faster SIV disease progression (96). In non-pathogenic, as in pathogenic, SIV infection, however, a severe depletion of memory T-cells in the gut occurs, apparently without causing generalized immune activation in non-pathogenic SIV infection (54, 55).
 
As the dynamics of virus and virus-infected CD4+ T-cells in these animal models of SIV infection are comparable, excessive indirect activation-induced killing of T-cells in rhesus macaques has been proposed to be the major pathological difference (37, 38, 87, 97–100). Indeed, despite the fact that RM develop strong immune responses upon SIV infection, these responses fail to clear the virus, resulting in persistently high levels of immune activation throughout infection (38, 101).
 
Compelling evidence has been obtained for a SM-specific polymorphism in TLR signaling, leading to attenuated production of type I IFN by pDCs induced via TLR7/9 activation in SIV-infected SM (102, 103). The gene involved, IRF-7, encodes a signaling protein downstream of TLR7 and 9. Interestingly, TLR7- and 9-induced production of TNFα appeared to be unaffected in SM, which agrees with the fact that TNFα release is mediated by the NF-κB and not by the IRF-7 pathway. This observation suggests that release of type I IFNs, but not TNFα, may be critical for SIV pathogenesis, which makes IFNα and IRF-7 potential drug targets. Despite the inability of SM to produce high levels of type I IFNs upon TLR7/9 activation by SIV, peak viremia during acute SIV infection in these animals is accompanied by clear signs of an innate and adaptive immune response, including the induction of IFN-stimulated genes (ISGs) (104, 105). Gene expression profiling showed the induction of ISGs, acute inflammatory genes, and genes associated with chemotaxis and neutrophil recruitment, DC activation and maturation, apoptosis, and cytotoxic T-cell responses during the acute phase of both pathogenic and non-pathogenic SIV infection (83, 104–106). In SM and AGM, expression of ISGs returns to normal levels after 30 days of infection. Since this decline in inflammation is paralleled by a gene expression program of immune regulatory genes, including genes that down-regulate T-cell responses [e.g., indoleamine 2,3-dioxygenase (IDO), IL-10, LAG3, and PD-L1] and genes that down-regulate IFN responses (e.g., adenosine deaminase), it has been proposed that active down regulation may be involved (83, 104). Further detailed mechanistic studies are required to reveal whether, and if so which, specific down-regulatory pathways are involved. Of note, host genes implicated in intracellular viral restriction are also rapidly up-regulated in non-pathogenic infection (83, 104).
 
If type I IFN is one of the main causes of immune activation in HIV and SIV infection, it remains puzzling how the clear difference in IFNα production by pDC from SIV-infected SM and RM can be reconciled with the apparent similarity of immune responses, and specifically the expression of IFN-inducible genes, observed during acute SIV infection of both species. There is, however, evidence that upregulation of ISGs in acutely SIV-infected SM is induced even though IFNα production by their pDCs is severely diminished (66, 83, 104). Interestingly, Favre and colleagues (66) found upregulation of IFNα, but not IL-12 and IL-6, in acute SIV infection in AGM, although IFNα release was very limited in duration compared to the sustained release of all three cytokines in pathogenic SIV infection. Also the detailed characteristics of immune activation in acutely SIV-infected RM and SM are quite different. Acute SIV infection in SM (and AGM) is not accompanied by increased CD4+ T-cell turnover, but strong increases in CD8+ T-cell activation, division (Ki67 expression) and apoptosis have been observed (99, 102, 107, 108). Thus, both timing and quality of gene expression of pro-inflammatory cytokines seem to be critically different between pathogenic and non-pathogenic SIV infection (109). Taken together, current data are compatible with the idea that SM and AGM respond to SIV with a limited and transient innate response and with an adaptive response that is mainly restricted to CD8+ T cells. In pathogenic SIV infection, an excessive innate response is generated with sustained IFNα and ISG induction which induces proliferation of NK cells and a broad SIV-specific and bystander CD4+ and CD8+ T-cell response (83, 102, 108). It could be that in SM and AGM, low and transient type I IFN responses during acute SIV infection induce a different gene expression program, allowing for resolution and/or down regulation of the immune response during subsequent chronic SIV infection.
 
It has been proposed that damage control in SIV-infected SM may in part be due to the preservation of central memory CD4+ T-cells (Tcm), which are thought to provide protection against the harmful side effects of bacterial translocation (110). Depletion of memory T-cells from the gut and bacterial translocation occur only transiently during acute SIV infection in SMs (54). In contrast to rhesus macaques, SMs are able to avoid epithelial barrier break down and thereby limit the undesired side effects of bacterial translocation during chronic SIV infection (111). SM are able to spare Tcm from viral infection because of low CCR5 expression (112), while in AGM, Tcm may be protected against SIV infection by CD4 down regulation (113). In pathogenic SIV and HIV infection, in contrast, Tcm are thought to be selectively lost through viral infection (112). However, the observation that the number of activated and naive T-cells, and not the number of Tcm, is predictive for HIV-disease progression does not support the idea that Tcm numbers are most critical (114). High levels of immune activation in pathogenic SIV infection may promote SIV infection of Tcm, resulting in Tcm depletion which may contribute to the vicious cycle of loss of immune control. Further investigations GAP Tcm activation and death.
 
In conclusion, non-pathogenic SIV infections of SM and AGM are examples of a pathogen-host symbiosis with an established state of tolerance. This is not immunological tolerance in the strict sense, but a state of tolerance in which the host resists the pathological effects of the virus by avoiding excessive inflammation (115, 116). Further investigation into the various and potentially different mechanisms by which SM and AGM avoid chronic immune activation is warranted and of great importance for our understanding and the treatment of HIV disease.

CAUSES OF IMMUNE ACTIVATION IN HIV INFECTION

It has long been known that innate and adaptive immunity get activated upon acute HIV infection, as extensively described and reviewed elsewhere (35, 39–46). Chronic HIV infection is now known to be characterized by increased expression of pro-inflammatory cytokines, including type I IFNs, IL-6, TGFβ, IL-8, IL-1α, and IL-1β, serum markers of inflammation including sCD14, CRP, cystatin C, D-dimers, and activation of the coagulation system (47). In the last couple of years much attention has focused on the causes of immune activation in HIV infection, with a redirection of research focus from T-cell immunity to innate immunity.

BREACH OF GASTRO-INTESTINAL IMMUNITY

In the late 1990s, acute SIV infection in rhesus macaques (RMs) was shown to induce a severe and rapid depletion of memory CD4+ T-cells from the gut (48). Later, in both humans and monkeys, it was found that this breach of the gut immune system resulted in a significant increase in bacterial components, including lipopolysaccharide (LPS), in the blood (49–51). LPS is a known activator of innate immune cells via Toll-like receptor (TLR)4, and LPS concentrations in the circulation of HIV-infected individuals correlated strongly with T-cell activation levels (51, 52). It was concluded that translocation of immune stimulatory bacterial products contributes to systemic immune activation, via TLR activation of various leukocyte populations. LPS was used as an indicator for bacterial translocation, but other bacterial products, such as flagellin, peptidoglycan, and bacterial CpG-rich DNA domains that are recognized by TLR2, 5, and 9 respectively, may also contribute to immune activation. It was proposed that the early attack on the memory CD4+ T-cell population in the gut may be a critical determinant of disease progression (53). However, also in non-pathogenic SIV infection a severe depletion of memory T cells in the gut occurs, apparently without causing generalized immune activation (54, 55). Moreover, an attenuated variant of pathogenic SIVmac239 was shown to spare mucosal CD4+ T-cells and yet to cause T-cell activation, CD4+ T-cell loss, and progression to AIDS without any signs of microbial translocation (56), showing that immune activation due to gut damage is not required to develop AIDS. On the other hand, in patients on cART, with very low HIV viral load, residual levels of bacterial translocation were positively correlated with immune activation levels suggesting that bacterial translocation may be a dominant driver of immune activation in patients treated with anti-viral drugs (57–63).
 
The breach of gut integrity in pathogenic SIV and HIV infection has been shown to be associated with depletion of CD4+ Th17 cells, a cell type that is normally abundant in the mucosa and is known to be involved in immunity to commensal bacteria (64). It is assumed that the immune system normally keeps a delicate balance between T regulatory (Treg) cells and Th17 cells, to protect against pathogens but avoid collateral damage from excessive immune responses (65). The selective loss of Th17 CD4+ T-cells from the gut, possibly due to selective infection, has therefore been held responsible for the long-term loss of intestinal integrity and thereby for chronic immune activation in pathogenic HIV infection (64, 66, 67). More recently, depletion of IL-21-producing CD4+ T-cells has been observed in both the blood and rectal mucosa of SIV-infected RMs (68). Treatment of these animals with IL-21 resulted in the maintenance of intestinal Th17 cells, and a reduction of microbial translocation and systemic inflammation (69). The dynamics of the Th17/Treg balance and the role of Th17 cells and Th17-derived cytokines in HIV infection are currently the subject of intensive study.

SINGLE-STRANDED RNA, TOLL-LIKE RECEPTORS, AND TYPE I IFN PRODUCTION

In 2004 it was reported that TLR7 and 8 recognize RNA from various viruses (70, 71), and it has been demonstrated that single-stranded (ss) HIV RNA directly activates the innate immune system via these TLRs (72, 73). After endosomal binding of ss HIV RNA to TLR7, HIV induces the release of type I interferons by plasmacytoid dendritic cells (pDCs) through the upregulation of TRAIL (72–75). Single-stranded HIV RNA has also been shown to activate NK cells in a TLR7- and 8-dependent way, and this process is dependent on cell–cell contact between pDCs and monocytes (76). Finally, pro-inflammatory responses can be induced through intracellular recognition of HIV DNA intermediates. These intermediates can be the result of abortive HIV infection of CD4+ T-cells, and induce the production of IFN-β and IL-1β (4). In agreement with these in vitro observations, gene expression analyses of lymphocytes from HIV-infected persons showed a dominant signature of IFN-stimulated genes (ISGs) (77, 78). Immediately after the start of cART, when virus production and viral load rapidly decline, markers of T-cell activation, expression of pro-inflammatory cytokines such as IFNα, IL-6, IL-1β, and macrophage inflammatory protein-1α, adhesion molecules VCAM-1 and ICAM-1, and the levels of soluble markers for endothelial cell and coagulation activation are all rapidly and strongly reduced, although not to normal levels (15, 18, 73, 79–81). These data suggest that HIV itself, most likely through its ssRNA or DNA intermediates, is an important driver of immune activation in untreated HIV infection.
 
Type I IFNs provide an important link between chronic innate and adaptive immune activation in HIV infection, because they induce activation and maturation of pDCs, NK cells, T cells, and B cells (82). Gene expression profile data from pathogenic and non-pathogenic SIV-infected primates suggest that persistent release of type I IFNs is a particular feature of pathogenic infection (83). It is well established that pDCs are mass producers of type I IFNs (82). At a certain point, pDCs typically become refractory to restimulation by TLR ligands, thereby avoiding excessive immune activation and collateral damage in the course of viral infection (84, 85). Bhardwaj and colleagues (86) showed that HIV, in contrast to other TLR7 agonists such as influenza virus and herpes simplex virus, induces a partially matured phenotype in pDCs. Because of this phenotype, pDCs are not rendered refractory and continue to produce type I IFNs during ongoing HIV exposure.
 
Interestingly, and similar to what is observed in SIV-infected SMs (102, 104) and AGMs (83), chronically HIV-infected individuals who do not progress to AIDS despite their high viral loads turned out to have very low levels of proliferating and activated T-cells (117), correlating with relatively low levels of ISGs and immune activation gene expression in CD8+ T-cells (118). A recent study confirmed the central role of IFNα in HIV-1 infection by showing that IFNα is the dominant type I IFN detectable in the plasma of HIV-infected individuals and that its levels correlate with immune activation and depletion of CD4+ T-cells (119). In addition, it was shown that pDCs derived from women produce more IFNα in response to HIV-1 than pDCs from men, resulting in higher levels of T-cell activation (120, 121). This may at least in part explain the observation that HIV-infected women with a given viral load have a 1.6-fold higher risk of developing AIDS than men and, despite having lower viral loads on average, typically progress faster to AIDS than men (122).
 
It has been reported that pDCs from SMs have a species-specific inability to produce high levels of type I IFN (102, 103) related to sequence polymorphisms in IRF-7, a signaling protein downstream of TLR7 and 9 (see Box 1). Also in humans, polymorphisms of IRF-7 have been reported that are associated with the level of HIV-induced IFNα production by pDCs in vitro and with CD8+ T-cell activation in vivo (123). These data stress the importance of the IRF-7 pathway in HIV pathogenesis, although there is no definite proof yet that IRF-7 itself is responsible for the induction of different responses in different individuals. Together, these observations suggest that the continuous release of type I IFNs plays a critical role in SIV and HIV pathogenesis. Future studies should establish the direct and indirect roles of IRF-7 polymorphisms in determining the set point level of chronic immune activation in HIV-infected subjects, and should clarify the potential of IFNα and IRF-7 as drug targets (Figure 1; see below).
Figure 1 Pathways of chronic immune activation and its downstream effects in HIV infection
 
Miedema_fi03Figure1.png
HIV infection induces chronic immune activation through activation of the innate and the adaptive immune system, via single-stranded (ss) RNA and possibly through intracellular viral DNA, which activate pDCs via endosomal TLR7 and 8. This activation leads to the induction of IFNα via the IRF-7 pathway and the induction of IL-6, IL-12, TNFα, and TGFβ through the NF-κB pathway. Continuous activation of the lymphocyte compartment leads to attrition of the T-cell pool (14, 15) and “immune paralysis” (e.g., impaired CTL responses). Bacterial translocation may be another source of TLR activation via TLR2, 4, 5, and 9 (49–51). Over time, non-AIDS related complications also develop. Potential targets for therapeutic interventions that interfere with inflammation to diminish pathology are indicated. It has been shown that blocking the effect of TLR7 and 9 significantly reduces HIV-induced immune activation (124). Studies in pathogenic and non-pathogenic SIV infection suggest that blocking IRF-7 or IFNα should be investigated. In rheumatoid arthritis patients who were treated with TNFα inhibiting agents (infliximab, etanercept) it was shown that blocking the effect of TNFα reversed the increased incidence of cardiovascular complications and insulin resistance. By analogy, the potential for a therapy interfering with TNFα in HIV infection should be tested (125–127).

PATHOGENIC EFFECTS OF IMMUNE ACTIVATION AND INFLAMMATION

The key role of chronic immune activation in HIV and SIV pathogenesis is now commonly accepted, as it is so clearly associated with CD4+ T-cell decline and progression to AIDS. The clinical outcome of HIV infection, however, depends not only on CD4+ T-cell loss, but also on non-immunological side effects of chronic immune activation.

INFLAMMATION DRIVES CD4+ T-CELL DEPLETION AND LOSS OF HIV-SPECIFIC IMMUNITY

A large body of work has suggested that chronic immune activation in HIV infection has deleterious effects on immune function in general, as well as on HIV-specific immunity, by inducing persistent activation and maturation of many types of innate and adaptive immune cells (82). Through continuous activation and differentiation of T cells, chronic HIV infection gradually depletes the naive CD4+ and naive CD8+ T-cell pools (31, 35, 43, 128, 129). Intrinsically different responses of the distinct T-cell lineages to activation may determine clonal expansion and contraction (130), and thereby the sensitivity of the different T-cell populations to chronic activation-induced cell loss, although the molecular basis for these differences remains unclear. Thymic and T-cell progenitor dysfunction, most likely caused by aberrantly high levels of pro-inflammatory cytokines expressed during untreated HIV infection, has been reported (43, 131), and the loss of such progenitor cells could aggravate the depleting effects of chronic immune activation on the adaptive immune system. Moreover, continuous inflammation in lymph nodes has been suggested to result in TGFβ-induced collagen deposition, fibrosis, and pathological changes in lymph node architecture, possibly adding to impaired T-cell proliferation and survival (132–134). Continuous activation has recently been shown to induce upregulation of inhibitory receptors such as programmed death-1 (PD-1), CTLA-4, and Tim-3, which may interfere with ongoing HIV-specific T-cell responses, and ultimately lead to T-cell anergy and loss of HIV-specific T cells (135–137). Similarly, B-cell dysfunction, which is observed immediately after acute HIV infection (138), is closely related to chronic activation of the B-cell compartment. Increased B-cell turnover and differentiation is associated with the phenotypic and functional B-cell abnormalities characteristic of untreated HIV infection (139–142). A recent study showed the downregulation of the regulatory receptor B- and T-lymphocyte attenuator (BTLA) and the upregulation of PD-1 on B cells in HIV infection (143). Interestingly, a direct downregulating effect of type I IFN on BTLA expression on CD4+ and CD8+ T cells has been reported, which may directly contribute to T-cell hyperactivation (144). Recently, evidence was reported for a link between PD-1L on follicular Th cells and impairment of B-cell function (145).
 
Persistent immune activation has also been shown to have deleterious effects on HIV-specific CD4+ (7, 146–153) and CD8+ T-cell immunity (154–160), among other things by preventing the establishment of IL-2-producing memory CD4+ and CD8+ T cells (146, 151–153). HIV-specific cytotoxic T-cell responses are generally considered to play an important role in anti-HIV immunity. Certain HLA alleles clearly correlate with viral load setpoint and disease progression. In line with this, the major genetic factors related to HIV-1 control that emerged from a genome-wide association study (GWAS) were shown to affect HLA–viral peptide interaction (161). There is accumulating evidence that Gag-specific CTL responses which preferentially target conserved epitopes have a protective effect (162–171). However, in two large prospective cohort studies, CD4+ and CD8+ HIV Gag-specific T-cell immunity within the first year after HIV seroconversion was not found to be predictive of disease progression (172, 173). This observation was confirmed in a longitudinal study in an African cohort (174). In these studies too, immune activation turned out to be the strongest risk factor for disease progression, stronger than, and independent of, viral load (172, 173). It is important to consider the possibility that the typical association between strong CTL responses and a lack of HIV-disease progression that is observed in cross-sectional studies may merely reflect the preservation of CTL responses in the absence of chronic immune activation rather than a protective effect of CTL themselves (175).

HIV-INDUCED INFLAMMATION AND HIV-ASSOCIATED NON-AIDS DISEASE

Increasing insight into the source and the role of inflammation in HIV pathogenesis has been paralleled by recent progress in our understanding of the role of inflammation in a much wider spectrum of clinical conditions than infectious diseases. After the introduction of anti-retroviral therapy for HIV infection, several case studies suggested that patients treated with cART had an increased risk of developing sub-clinical atherosclerosis and acute myocardial infarction (176–179). Initial studies reported that the increased risk of cardiovascular disease was associated with specific classes of anti-viral drugs (180). Later studies revealed that cardiovascular risk was in fact larger in untreated compared to treated HIV infection (181, 182), but also that in patients on cART the risk for cardiovascular disease is higher than expected based on traditional cardiovascular risk factors alone. In addition to cardiovascular disease, HIV infection places patients at increased risk of developing a number of other non-AIDS related complications, such as non-alcoholic steatohepatitis, renal dysfunction, osteoporosis, insulin resistance, metabolic syndrome, and cognitive impairment (47). It has been shown that soluble mediators released by activated immune cells, such as IL-6, IL-1, and TNFα, also act on non-immune tissue cells with various tissue-dependent pathological effects. In a broad variety of clinical conditions, including obesity, atherosclerosis, neurodegenerative disease, and autoimmune diseases, chronic inflammatory processes are now recognized to play a major role (183), and it has been postulated that most non-AIDS defining complications of HIV infection are related to the chronic inflammatory state induced by HIV (Figure 1) (184, 185). This hypothesis is strengthened by recent observations in patients with rheumatoid arthritis (RA). Both HIV infection and RA are characterized by a chronic inflammatory state and increased levels of pro-inflammatory cytokines like TNFα, IL-1β, and IL-6, and in RA patients, too, non-primary disease related complications such as cardiovascular disease, osteoporosis, non-alcoholic fatty liver disease (NAFLD), and cognitive impairment are more prevalent than in the general population (125–127).
Thus, clinical symptoms that initially seemed unrelated are now being recognized as part of the total complex of HIV-associated disease and appear to have a common underlying pathogenesis of chronic inflammation and excessive immune activation (186, 187). Preliminary data suggest a central role for TNFα in HIV-associated non-AIDS disease, but it remains to be determined to what extent other pro-inflammatory cytokines, perhaps acting via TNFα, are involved.

HIV IN COMPARISON TO OTHER PERSISTENT VIRAL INFECTIONS

These novel insights into HIV pathogenesis prompt the question as to how HIV differs from most other viruses. We believe that HIV pathogenesis is caused by a combination of specific characteristics. Most importantly, HIV infects CD4+ T helper cells. In addition, a variety of cells that express CD4 and one of the HIV coreceptors can be infected, albeit at very low levels. As a result, the virus is not confined to a single organ and may induce a variety of systemic immune responses. HIV induces much higher levels of cytokines during acute infection compared to hepatitis B or hepatitis C (41). HIV is virtually insensitive to control by neutralizing antibodies and cellular immunity because of various mechanisms, including the glycan shield surrounding the HIV virion (188) and the high mutation rate of the virus, which allows for rapid immune escape. After acute HIV infection, virus- and host-specific setpoints are established that determine the subsequent clinical course based on the level, and probably the type, of immune activation that is induced. Like other viruses, HIV induces type I IFN release by pDCs. The fact that HIV is targeted to pDCs by virtue of their expression of CD4, and the recent finding that HIV does not induce full maturation of pDCs, which prevents these cells from becoming refractory to restimulation, as outlined above (86), may turn out to be critical factors driving persistent IFN release and thereby chronic activation of the innate and the adaptive immune system in HIV patients, resulting in exhaustion of immunity and broad spectrum end-organ immune pathology. Even though immune responses in acute hepatitis B and hepatitis C virus infection may differ from those in acute HIV infection, persistently increased immune activation levels have been reported in individuals who do not clear hepatitis viral infection and who convert to chronic hepatitis. In analogy to what is observed in HIV patients, non-hepatitis related conditions, such as metabolic syndrome and cardiovascular disease, also occur more frequently in chronic hepatitis patients than in the general population, even when corrected for traditional risk factors for, e.g., cardiovascular disease (189). Strikingly, peripheral blood naive T-cell numbers in chronic hepatitis C virus-infected patients were found to be significantly lower than in healthy individuals, and associated with increased levels of inflammation (190). Thus, while immune responses during acute infection may differ between HIV, hepatitis B, and hepatitis C virus infection, leading to clearance of the virus in the majority of hepatitis B infected patients and a subset of hepatitis C infected individuals, once chronic inflammation has been established, its effects tend to be similar for the three patient groups.
Other viral infections, like Epstein–Barr virus (EBV) and cytomegalovirus (CMV), are not comparable to HIV or chronic hepatitis infection, because after an acute phase these infections convert into a truly latent stage, during which no virus is detectable in the peripheral blood of immunocompetent individuals.

THE IMMUNE ACTIVATION HYPOTHESIS REDUCED TO PRACTICE

BOOSTING IMMUNITY

Great effort has been put over the years into approaches to therapeutically strengthen anti-viral immune responses. Thus far, however, there is little proof for beneficial effects, and in fact the possibility of induction of adverse effects is an important concern. Therapeutic vaccination, with DNA and live viral vector-based vaccines and combinations thereof, has had only transient and small effects on viral load (191, 192). In one trial in which therapeutic vaccination was followed by interruption of cART, viral rebound was larger and time to restart therapy shorter than in the non-vaccinated group (193). With respect to prophylactic vaccines, CTL-based vaccines may have some potential if they manage to consistently lower the viral setpoint. However, upon infection such vaccines will at best reduce and not completely prevent chronic immune activation driven pathogenesis and should therefore not be considered curative. In fact, the strongest protective effect is to be expected from HIV vaccines that stimulate HLA-B57, B58, or B27 restricted T-cell responses, as they are associated with significantly lower viral loads. Such vaccines would however only help the carriers of protective HLA molecules, most of whom already experience much slower disease progression upon HIV infection. In order to develop CTL vaccines that are applicable to a wider patient population it is of vital importance to gain better insight into the mechanisms responsible for the relative protection conferred by these protective HLA molecules.
 
For boosting of immunity and enhancement of CD4+ T-cell production, IL-2 has been administered in large scale multi-center international trials in patients with and without cART, with substantial increases in CD4+ T-cell counts but no beneficial clinical effects (194). Administration of IL-7 (195, 196) or human growth hormone (197) has been tested in small cohorts, with successful effects on naive and central memory T-cell numbers, but again without significant clinical effects. As these biological compounds are known to have strong activating effects on the peripheral T-cell compartment (198), their administration is not without risk, and one should be aware of possible adverse effects in the long run. Immune stimulating therapy should in any case be restricted to patients on cART, although even on cART (residual) immune activation is correlated with poor immune reconstitution (199). To enhance anti-HIV responses, blockade of inhibitory ligand-receptor interactions, such as PD-1, CTLA-4, and Tim-3, has been proposed (200). Some positive results have been obtained with PD-1 blockade in SIV-infected macaques, which has been shown to lead to improved virus-specific CD8+ T-cell responses, reduction in plasma viral load and prolonged survival (201), and to reduced hyperactivation and bacterial translocation (202). However, experiments with CTLA-4 blockade have demonstrated that the effects of inhibitory receptor blockade may even be deleterious, leading to increased T-cell activation and viral replication (203). Great care therefore needs to be taken with approaches that may increase the level of CD4+ T-cell proliferation, and in our opinion such approaches should never be applied without cART.
 
Taken together, therapeutic interventions aiming at enhancing anti-HIV T-cell immunity may not have the desired beneficial effect in the majority of people, and may even have adverse long-term effects through the immune stimulation they induce. It has been argued that since our understanding of virus-specific cellular immunity (in particular its repertoire, its functional and kinetic requirements, and its regulation and tissue distribution) is still far from complete, the real correlate of immune protection against AIDS is still to be discovered (204). Indeed, not all immune activation needs to be equally pathogenic, and we cannot exclude the possibility that induction of HIV-specific T-cell responses without excessive and chronic release of type I IFNs and other cytokines might be favorable to the host for control of HIV. However, as pDC activation is believed to be required for the induction of an adequate adaptive T-cell response, induction of strong HIV-specific immune responses without chronic release of type I IFNs may be an impossible combination; in fact, pDC activation may collaterally cause the very same pathology that the adaptive immune response should prevent. Irrespective of the hypothesis of what is causing AIDS pathogenesis, of all vaccination strategies, prophylactic vaccines that are able to induce a strong broadly neutralizing antibody response seem at this time to be the most promising way to induce protective immunity to HIV infection in a large number of individuals (205).

THERAPEUTIC DAMAGE CONTROL

Another corollary of the immune activation hypothesis is that immune suppressive therapy might have beneficial clinical effects because it reduces the deleterious effects of immune activation. Immune suppressive drugs like cyclosporin (206, 207) and mycophenolic acid (208), which are used to prevent T-cell activation in organ transplant rejection, have been experimentally tried in HIV infection. In combination with cART, variable effects on T-cell turnover, activation, and CD4+ T-cell numbers were shown (206–208).
 
Given the recent insight that it is not activation of CD4+ and CD8+ T cells via the TCR, but instead TLR activation, release of type I IFNs, and expression of IFNα/β inducible genes that may contribute more to systemic immune activation in HIV infection, the latter proteins and genes may be more relevant targets for therapeutic interventions (Figure 1). TLR antagonists and inhibitors are currently an area of intense investigation and it is to be expected that many will become available for phase I/II or experimental proof of concept clinical trials in the very near future (209, 210). Indeed, in a preliminary study in which chloroquine, an inhibitor of endosomal TLR3, 7, 8, and 9, was administered to HAART-naive HIV-infected patients, significantly lower immune activation levels were observed, as reflected by decreased levels of T-cell division and expression of activation markers (124). Although these findings need to be reconfirmed and more clinical studies are needed, this study suggests that interference with HIV-induced TLR7/9 activation is feasible. Because of the clear association between immune activation and clinical outcome, such interventions may be promising. IRF-7, which selectively induces IFNα but not TNFα or IL-12 production, is also a potential drug target, and treatment with IFNα neutralizing antibodies or blocking TNFα or the TNFα-R are feasible options to be explored in order to decrease inflammation and tissue-related pathology. Indeed, targeting TNFα in pathogenic SIV infection in RMs by administration of adalimumab (Humira) has been shown to reduce systemic inflammation and many of its downstream effects (211).
 
Humanized anti-IFNα monoclonal antibodies have been developed and have been tested in phase I trials in patients suffering from systemic lupus erythematosus (SLE) and psoriasis, autoimmune diseases in which IFNα is believed to play a critical role. In SLE but not psoriasis, one dose of anti-IFNα monoclonal antibody resulted in downregulation of IFN-inducible gene expression with beneficial clinical effects (212, 213). No evidence for adverse effects, such as an increase in viral infections or viral reactivation, was observed, which opens up the possibility of considering anti-IFNα treatment in HIV-infected patients to neutralize over-expression of IFNα. Induction of anti-IFNα antibodies by immunization with inactivated IFNα to inhibit progression to AIDS has been investigated in a large multi-centre study, and beneficial effects on CD4+ T-cell decline and markers of clinical progression were reported in patients who developed anti-IFNα antibodies (214). Although these studies have never been repeated, the recently obtained insights into the role of IFNα in HIV-disease progression warrant future research in this direction.
 
Paradoxically, IFNα administration was investigated in the pre-cART era as a treatment option for HIV infection with or without Kaposi sarcoma (215, 216). Although IFNα treatment showed the expected anti-viral effect, leading to lower viral loads, this type of treatment became of less interest when cART became available. In addition, IFNα treatment induced flu-like syndrome, immune activation, and T-cell depletion when given to HIV patients co-infected with HCV (217–219).
 
In RA patients who were treated with TNFα inhibiting agents (such as infliximab or etanercept) it was shown that blocking the effect of TNFα reversed the increased incidence of cardiovascular complications and insulin resistance (125–127). Anecdotal reports have shown the safety of anti-TNFα treatment in RA patients who were also HIV infected and on HAART (220). A non-specific intervention aimed at lowering immune activation and its side effects, such as cardiovascular disease, might be the addition of statins to standard anti-retroviral regimens, as has been suggested for treatment of RA (221).
 
In addition to potentially improving HIV-treatment options, the interventions suggested above will provide us with a wealth of data allowing dissection of the relative contribution of different cytokines such as IFNα and TNFα to immune activation and end-organ immune pathology in HIV infection. It should be noted, however, that given the complex interrelationship between potentially protective immune responses and the damage induced by chronic immune activation, any of these interventions could in principle also aggravate HIV-induced pathology. Therefore, a combination with HAART seems at this time the best approach.
 

CONCLUSION

We review compelling evidence for CD4+ T-cell loss in HIV infection caused by various downstream effects of persistent and strong innate immune activation. Immune activation is induced by HIV ssRNA and possibly its DNA intermediates and to some extent by translocation of bacterial products from the gut. This CD4+ T-cell death occurs in addition to CD4+ T-cell loss due to direct HIV-induced cell killing. We conclude that immune activation is most likely the main cause of CD4+ T-cell depletion, loss of HIV-specific immunity and HIV-associated non-AIDS disease, also in patients on cART. Although much knowledge is still lacking, we are beginning to understand which receptors and active molecules are most likely dominant in the cellular and molecular pathways involved in HIV pathology. This new perspective has major implications for HIV vaccinology, but also opens up novel therapeutic options that may be explored in the near future.
 
Search strategy and selection criteria: references for this article were identified through searches of PubMed for articles published from 1985, by use of the terms HIV, SIV, AIDS, immune activation, immunity, pathogenesis. Articles resulting from these searches and relevant references cited in those articles were reviewed. Articles published in English were included.

ACKNOWLEDGMENTS

Our work is supported by the Dutch Aids Fonds (to Kiki Tesselaar), the Netherlands Organization for Scientific Research (NWO, grant 917.96.350 to Mette D. Hazenberg, 836.07.002 to José A.M. Borghans, and 016.048.603 to Rob J. de Boer), and by a Utrecht University High Potential Grant (to Debbie van Baarle).

REFERENCES

1

Fauci AS, Pantaleo G, Stanley S, Weissman D. Immunopathogenic mechanisms of HIV infection. Ann Intern Med (1996) 124:654–63. doi:10.7326/0003-4819-124-7-199604010-00006

2

Cooper A, Garcia M, Petrovas C, Yamamoto T, Koup RA, Nabel GJ. HIV-1 causes CD4 cell death through DNA-dependent protein kinase during viral integration. Nature (2013) 498(7454):376–9. doi:10.1038/nature12274

3

Douek DC, Brenchley JM, Betts MR, Ambrozak DR, Hill BJ, Okamoto Y, et al. HIV preferentially infects HIV-specific CD4+ T cells. Nature (2002) 417(6884):95–8. doi:10.1038/417095a

4

Doitsh G, Cavrois M, Lassen KG, Zepeda O, Yang Z, Santiago ML, et al. Abortive HIV infection mediates CD4 T cell depletion and inflammation in human lymphoid tissue. Cell (2010) 143(5):789–801. doi:10.1016/j.cell.2010.11.001

5

Finkel TH, Tudor-Williams G, Banda NK, Cotton MF, Curiel T, Monks C, et al. Apoptosis occurs predominantly in bystander cells and not in productively infected cells of HIV- and SIV-infected lymph nodes. Nat Med (1995) 1:129–34. doi:10.1038/nm0295-129

6

Meyaard L, Otto SA, Jonker RR, Mijnster MJ, Keet RPM, Miedema F. Programmed death of T cells in HIV-1 infection. Science (1992) 257:217–9. doi:10.1126/science.1352911

7

Pitcher CJ, Quittner C, Peterson DM, Connors M, Koup RA, Maino V, et al. HIV-1-specific CD4+ T cells are detectable in most individuals with active HIV-1 infection, but decline with prolonged viral suppression. Nat Med (1999) 5(5):518–25. doi:10.1038/8400

8

Altman JD, Moss PAH, Goulder PJR, Barouch DH, McHeyzer-Williams MG, Bell JI, et al. Phenotypic analysis of antigen-specific T lymphocytes. Science (1996) 274:94–6. doi:10.1126/science.274.5284.94

9

Ogg GS, Jin X, Bonhoeffer S, Moss P, Nowak M, Monard S, et al. Decay kinetics of human immunodeficiency virus-specific effector cytotoxic T lymphocytes after combination anti-retroviral therapy. J Virol (1999) 73:797–800.

10

Leslie AJ, Pfafferott KJ, Chetty P, Draenert R, Addo MM, Feeney M, et al. HIV evolution: CTL escape mutation and reversion after transmission. Nat Med (2004) 10(3):282–9. doi:10.1038/nm992

11

De Boer RJ, Mohri H, Ho DD, Perelson AS. Turnover rates of B cells, T cells, and NK cells in simian immunodeficiency virus-infected and uninfected rhesus macaques. J Immunol (2003) 170(5):2479–87.

12

Draenert R, Verrill CL, Tang Y, Allen TM, Wurcel AG, Boczanowski M, et al. Persistent recognition of autologous virus by high-avidity CD8 T cells in chronic, progressive human immunodeficiency virus type 1 infection. J Virol (2004) 78(2):630–41. doi:10.1128/JVI.78.2.630-641.2004

13

Hellerstein MK, Hoh RA, Hanley MB, Cesar D, Lee D, Neese RA, et al. Subpopulations of long-lived and short-lived T cells in advanced HIV-1 infection. J Clin Invest (2003) 112(6):956–66. doi:10.1172/JCI17533

14

Grossman Z, Paul WE. The impact of HIV on naïve T-cell homeostasis. Nat Med (2000) 6:976–7. doi:10.1038/79667

15

Hazenberg MD, Stuart JW, Otto SA, Borleffs JC, Boucher CA, de Boer RJ, et al. T cell division in human immunodeficiency virus (HIV-1)-infection is mainly due to immune activation: a longitudinal analysis in patients before and during highly active anti-retroviral therapy. Blood (2000) 95(1):249–55.

16

Mohri H, Perelson AS, Tung K, Ribeiro RM, Ramratnam B, Markowitz M, et al. Increased turnover of T lymphocytes in HIV-1 infection and its reduction by anti-retroviral therapy. J Exp Med (2001) 194:1277–87. doi:10.1084/jem.194.9.1277

17

Kovacs JA, Lempicki RA, Sidorov IA, Adelsberger JW, Herpin B, Metcalf JA, et al. Identification of dynamically distinct subpopulations of T lymphocytes that are differentially affected by HIV. J Exp Med (2001) 194:1731–41. doi:10.1084/jem.194.12.1731

18

Lempicki RA, Kovacs JA, Baseler MW, Adelsberger JW, Dewar RL, Natarajan V, et al. Impact of HIV-1 infection and highly active anti-retroviral therapy on the kinetics of CD4+ and CD8+ T cell turnover in HIV-infected patients. Proc Natl Acad Sci U S A (2000) 97:13778–83. doi:10.1073/pnas.250472097

19

Fahey JL, Taylor JMG, Detels R, Hofmann B, Melmed R, Nishanian P, et al. The prognostic value of cellular and serologic markers in infection with human immunodeficiency virus type 1. N Engl J Med (1990) 322:166–72. doi:10.1056/NEJM199001183220305

20

Hofmann B, Wang YX, Cumberland WG, Detels R, Bozorgmehri M, Fahey JL. Serum beta2-microglobulin level increases in HIV infection: relation to seroconversion, CD4 T-cell fall and prognosis. AIDS (1990) 4(3):207–14. doi:10.1097/00002030-199003000-00005

21

Bofill M, Mocroft A, Lipman M, Medina E, Borthwick NJ, Sabin CA, et al. Increased numbers of primed activated CD8+CD38+CD45RO+ T cells predict the decline of CD4+ T cells in HIV-1-infected patients. AIDS (1996) 10:827–34. doi:10.1097/00002030-199607000-00005

22

Giorgi JV, Liu Z, Hultin LE, Cumberland WG, Hennessey K, Detels R. Elevated levels of CD38+CD8+ T cells in HIV infection add to the prognostic value of low CD4+ T cell levels: results of 6 years of follow-up. The Los Angeles Center, Multicenter AIDS Cohort Study. J Acquir Immune Defic Syndr (1993) 6:904–12.

23

Lien E, Aukrust P, Sundan A, Muller F, Froland SS, Espevik T. Elevated levels of serum-soluble CD14 in human immunodeficiency virus type 1 (HIV-1) infection: correlation to disease progression and clinical events. Blood (1998) 92(6):2084–92.

24

Godfried MH, van der Poll T, Jansen J, Romijin JA, Schattenkerk JK, Endert E, et al. Soluble receptors for tumour necrosis factor: a putative marker of disease progression in HIV infection. AIDS (1993) 7(1):33–6. doi:10.1097/00002030-199301000-00005

25

Zangerle R, Fuchs D, Sarcletti M, Gallati H, Reibnegger G, Wachter H, et al. Increased concentrations of soluble tumor necrosis factor receptor 75 but not of soluble intercellular adhesion molecule-1 are associated with the decline of CD4+ lymphocytes in HIV infection. Clin Immunol Immunopathol (1994) 72(3):328–34. doi:10.1006/clin.1994.1149

26

Zangerle R, Fuchs D, Reibnegger G, Fritsch P, Wachter H. Markers for disease progression in intravenous drug users infected with HIV-1. AIDS (1991) 5(8):985–91. doi:10.1097/00002030-199108000-00010

27

von Sydow M, Sonnerborg A, Gaines H, Strannegard O. Interferon-alpha and tumor necrosis factor-alpha in serum of patients in various stages of HIV-1 infection. AIDS Res Hum Retroviruses (1991) 7(4):375–80. doi:10.1089/aid.1991.7.375

28

Giorgi JV, Hultin LE, McKeating JA, Johnson TD, Owens B, Jacobson LP, et al. Shorter survival in advanced human immunodeficiency virus type 1 infection is more closely associated with T lymphocyte activation than with plasma virus burden or virus chemokine coreceptor usage. J Infect Dis (1999) 179:859–70. doi:10.1086/314660

29

Zangerle R, Steinhuber S, Sarcletti M, Dierich MP, Wachter H, Fuchs D, et al. Serum HIV-1 RNA levels compared to soluble markers of immune activation to predict disease progression in HIV-1-infected individuals. Int Arch Allergy Immunol (1998) 116(3):228–39. doi:10.1159/000023949

30

Liu Z, Cumberland WG, Hultin LE, Kaplan AH, Detels R, Giorgi JV. CD8+ T lymphocyte activation in HIV-1 disease reflects an aspect of pathogenesis distinct from viral burden and immunodeficiency. J Acquir Immune Defic Syndr (1998) 18:332–40. doi:10.1097/00042560-199808010-00004

31

Hazenberg MD, Otto SA, van Benthem BH, Roos MT, Coutinho RA, Lange JM, et al. Persistent immune activation in HIV-1 infection is associated with progression to AIDS. AIDS (2003) 17(13):1881–8. doi:10.1097/00002030-200309050-00006

32

Deeks SG, Kitchen CM, Liu L, Guo H, Gascon R, Narvaez AB, et al. Immune activation setpoint during early HIV infection predicts subsequent CD4+ T-cell changes independent of viral load. Blood (2004) 104(4):942–7. doi:10.1182/blood-2003-09-3333

33

Marlink R, Kanki P, Thior I, Travers K, Eisen G, Siby T, et al. Reduced rate of disease development after HIV-2 infection as compared to HIV-1. Science (1994) 265:1587–90. doi:10.1126/science.7915856

34

Schramm B, Penn ML, Palacios EH, Grant RM, Kirchhoff F, Goldsmith MA. Cytopathicity of human immunodeficiency virus type 2 (HIV-2) in human lymphoid tissue is coreceptor dependent and comparable to that of HIV-1. J Virol (2000) 74:9594–600. doi:10.1128/JVI.74.20.9594-9600.2000

35

Grossman Z, Meier-Schellersheim M, Sousa AE, Victorino RMM, Paul WE. CD4 T-cell depletion in HIV infection: are we closer to understanding the cause? Nat Med (2002) 8:319–23. doi:10.1038/nm0402-319

36

Sousa AE, Carneiro J, Meier-Schellersheim M, Grossman Z, Victorino RM. CD4 T cell depletion is linked directly to immune activation in the pathogenesis of HIV-1 and HIV-2 but only indirectly to the viral load. J Immunol (2002) 169(6):3400–6.

37

Chakrabarti LA, Lewin SR, Zhang L, Gettie A, Luckay A, Martin LN, et al. Normal T cell turnover in sooty mangabeys harboring active simian immunodeficiency virus infection. J Virol (2000) 74(3):1209–23. doi:10.1128/JVI.74.3.1209-1223.2000

38

Silvestri G, Sodora DL, Koup RA, Paiardini M, O’Neil SP, McClure HM, et al. Nonpathogenic SIV infection of sooty mangabeys is characterized by limited bystander immunopathology despite chronic high-level viremia. Immunity (2003) 18(3):441–52. doi:10.1016/S1074-7613(03)00060-8

39

Hazenberg MD, Hamann D, Schuitemaker H, Miedema F. T cell depletion in HIV-1 infection: how CD4+ T cells go out of stock. Nat Immunol (2000) 1(4):285–9. doi:10.1038/79724

40

Pedersen C, Lindhardt BO, Jensen BL, Lauritzen E, Gerstoft J, Dickmeiss E, et al. Clinical course of primary HIV infection: consequences for subsequent course of infection. BMJ (1989) 299(6692):154–7. doi:10.1136/bmj.299.6692.154

41

Stacey AR, Norris PJ, Qin L, Haygreen EA, Taylor E, Heitman J, et al. Induction of a striking systemic cytokine cascade prior to peak viremia in acute human immunodeficiency virus type 1 infection, in contrast to more modest and delayed responses in acute hepatitis B and C virus infections. J Virol (2009) 83(8):3719–33. doi:10.1128/JVI.01844-08

42

Gaines H, von Sydow MA, von Stedingk LV, Biberfeld G, Bottiger B, Hansson LO, et al. Immunological changes in primary HIV-1 infection. AIDS (1990) 4(10):995–9. doi:10.1097/00002030-199010000-00008

43

Douek DC, Picker LJ, Koup RA. T cell dynamics in HIV-1 infection. Annu Rev Immunol (2003) 21:265–304. doi:10.1146/annurev.immunol.21.120601.141053

44

Clark DR, De Boer RJ, Wolthers KC, Miedema F. T cell dynamics in HIV-1 infection. Adv Immunol (1999) 73:301–27. doi:10.1016/S0065-2776(08)60789-0

45

Elbim C, Pillet S, Prevost MH, Preira A, Girard PM, Rogine N, et al. Redox and activation status of monocytes from human immunodeficiency virus-infected patients: relationship with viral load. J Virol (1999) 73(6):4561–6.

46

Elbim C, Prevot MH, Bouscarat F, Franzini E, Chollet-Martin S, Hakim J, et al. Polymorphonuclear neutrophils from human immunodeficiency virus-infected patients show enhanced activation, diminished fMLP-induced L-selectin shedding, and an impaired oxidative burst after cytokine priming. Blood (1994) 84(8):2759–66.

47

Deeks SG. HIV infection, inflammation, immunosenescence, and aging. Annu Rev Med (2011) 62:141–55. doi:10.1146/annurev-med-042909-093756

48

Veazey RS, De Maria M, Chalifoux LV, Shvetz DE, Pauley DR, Knight HL, et al. Gastrointestinal tract as a major site of CD4+ T cell depletion and viral replication in SIV infection. Science (1998) 280(5362):427–31. doi:10.1126/science.280.5362.427

49

Brenchley JM, Schacker TW, Ruff LE, Price DA, Taylor JH, Beilman GJ, et al. CD4+ T cell depletion during all stages of HIV disease occurs predominantly in the gastrointestinal tract. J Exp Med (2004) 200(6):749–59. doi:10.1084/jem.20040874

50

Li Q, Duan L, Estes JD, Ma ZM, Rourke T, Wang Y, et al. Peak SIV replication in resting memory CD4+ T cells depletes gut lamina propria CD4+ T cells. Nature (2005) 434(7037):1148–52.

51

Brenchley JM, Price DA, Schacker TW, Asher TE, Silvestri G, Rao S, et al. Microbial translocation is a cause of systemic immune activation in chronic HIV infection. Nat Med (2006) 12(12):1365–71. doi:10.1038/nm1511 

52

Gordon SN, Cervasi B, Odorizzi P, Silverman R, Aberra F, Ginsberg G, et al. Disruption of intestinal CD4+ T cell homeostasis is a key marker of systemic CD4+ T cell activation in HIV-infected individuals. J Immunol (2010) 185(9):5169–79. doi:10.4049/jimmunol.1001801

53

Brenchley JM, Price DA, Douek DC. HIV disease: fallout from a mucosal catastrophe? Nat Immunol (2006) 7(3):235–9. doi:10.1038/ni1316

54

Gordon SN, Klatt NR, Bosinger SE, Brenchley JM, Milush JM, Engram JC, et al. Severe depletion of mucosal CD4+ T cells in AIDS-free simian immunodeficiency virus-infected sooty mangabeys. J Immunol (2007) 179(5):3026–34.

55

Pandrea IV, Gautam R, Ribeiro RM, Brenchley JM, Butler IF, Pattison M, et al. Acute loss of intestinal CD4+ T cells is not predictive of simian immunodeficiency virus virulence. J Immunol (2007) 179(5):3035–46.

56

Breed MW, Jordan AP, Aye PP, Lichtveld CF, Midkiff CC, Schiro FR, et al. Loss of a tyrosine-dependent trafficking motif in the simian immunodeficiency virus envelope cytoplasmic tail spares mucosal CD4 cells but does not prevent disease progression. J Virol (2013) 87(3):1528–43. doi:10.1128/JVI.01928-12

57

Marchetti G, Bellistri GM, Borghi E, Tincati C, Ferramosca S, La FM, et al. Microbial translocation is associated with sustained failure in CD4+ T-cell reconstitution in HIV-infected patients on long-term highly active anti-retroviral therapy. AIDS (2008) 22(15):2035–8. doi:10.1097/QAD.0b013e3283112d29

58

Deeks SG, Phillips AN. HIV infection, antiretroviral treatment, ageing, and non-AIDS related morbidity. BMJ (2009) 338:a3172. doi:10.1136/bmj.a3172

59

Kalayjian RC, Machekano RN, Rizk N, Robbins GK, Gandhi RT, Rodriguez BA, et al. Pretreatment levels of soluble cellular receptors and interleukin-6 are associated with HIV disease progression in subjects treated with highly active antiretroviral therapy. J Infect Dis (2010) 201(12):1796–805. doi:10.1086/652750

60

Jiang W, Lederman MM, Hunt P, Sieg SF, Haley K, Rodriguez B, et al. Plasma levels of bacterial DNA correlate with immune activation and the magnitude of immune restoration in persons with antiretroviral-treated HIV infection. J Infect Dis (2009) 199(8):1177–85. doi:10.1086/597476

61

Cassol E, Malfeld S, Mahasha P, van der Merwe MS, Cassol S, Seebregts C, et al. Persistent microbial translocation and immune activation in HIV-1-infected South Africans receiving combination antiretroviral therapy. J Infect Dis (2010) 202(5):723–33. doi:10.1086/655229

62

Wallet MA, Rodriguez CA, Yin L, Saporta S, Chinratanapisit S, Hou W, et al. Microbial translocation induces persistent macrophage activation unrelated to HIV-1 levels or T-cell activation following therapy. AIDS (2010) 24(9):1281–90. doi:10.1097/QAD.0b013e328339e228

63

Baroncelli S, Galluzzo CM, Pirillo MF, Mancini MG, Weimer LE, Andreotti M, et al. Microbial translocation is associated with residual viral replication in HAART-treated HIV+ subjects with <50 copies/ml HIV-1 RNA. J Clin Virol (2009) 46(4):367–70. doi:10.1016/j.jcv.2009.09.011

64

Brenchley JM, Paiardini M, Knox KS, Asher AI, Cervasi B, Asher TE, et al. Differential Th17 CD4 T-cell depletion in pathogenic and non-pathogenic lentiviral infections. Blood (2008) 112(7):2826–35. doi:10.1182/blood-2008-05-159301

65

Littman DR, Rudensky AY. Th17 and regulatory T cells in mediating and restraining inflammation. Cell (2010) 140(6):845–58. doi:10.1016/j.cell.2010.02.021

66

Favre D, Lederer S, Kanwar B, Ma ZM, Proll S, Kasakow Z, et al. Critical loss of the balance between Th17 and T regulatory cell populations in pathogenic SIV infection. PLoS Pathog (2009) 5(2):e1000295. doi:10.1371/journal.ppat.1000295

67

Raffatellu M, Santos RL, Verhoeven DE, George MD, Wilson RP, Winter SE, et al. Simian immunodeficiency virus-induced mucosal interleukin-17 deficiency promotes Salmonella dissemination from the gut. Nat Med (2008) 14(4):421–8. doi:10.1038/nm1743

68

Micci L, Cervasi B, Ende ZS, Iriele RI, Reyes-Aviles E, Vinton C, et al. Paucity of IL-21-producing CD4(+) T cells is associated with Th17 cell depletion in SIV infection of rhesus macaques. Blood (2012) 120(19):3925–35. doi:10.1182/blood-2012-04-420240

69

Pallikkuth S, Micci L, Ende ZS, Iriele RI, Cervasi B, Lawson B, et al. Maintenance of intestinal Th17 cells and reduced microbial translocation in SIV-infected rhesus macaques treated with interleukin (IL)-21. PLoS Pathog (2013) 9(7):e1003471. doi:10.1371/journal.ppat.1003471

70

Heil F, Hemmi H, Hochrein H, Ampenberger F, Kirschning C, Akira S, et al. Species-specific recognition of single-stranded RNA via toll-like receptor 7 and 8. Science (2004) 303(5663):1526–9. doi:10.1126/science.1093620

71

Diebold SS, Kaisho T, Hemmi H, Akira S, Reis e Sousa C. Innate antiviral responses by means of TLR7-mediated recognition of single-stranded RNA. Science (2004) 303(5663):1529–31. doi:10.1126/science.1093616

72

Beignon AS, McKenna K, Skoberne M, Manches O, Da Silva I, Kavanagh DG, et al. Endocytosis of HIV-1 activates plasmacytoid dendritic cells via toll-like receptor-viral RNA interactions. J Clin Invest (2005) 115(11):3265–75. doi:10.1172/JCI26032

73

Meier A, Alter G, Frahm N, Sidhu H, Li B, Bagchi A, et al. MyD88-dependent immune activation mediated by human immunodeficiency virus type 1-encoded toll-like receptor ligands. J Virol (2007) 81(15):8180–91. doi:10.1128/JVI.00421-07

74

Fonteneau JF, Larsson M, Beignon AS, McKenna K, Da Silva I, Amara A, et al. Human immunodeficiency virus type 1 activates plasmacytoid dendritic cells and concomitantly induces the bystander maturation of myeloid dendritic cells. J Virol (2004) 78(10):5223–32. doi:10.1128/JVI.78.10.5223-5232.2004

75

Hardy AW, Graham DR, Shearer GM, Herbeuval JP. HIV turns plasmacytoid dendritic cells (pDC) into TRAIL-expressing killer pDC and down-regulates HIV coreceptors by toll-like receptor 7-induced IFN-alpha. Proc Natl Acad Sci U S A (2007) 104(44):17453–8. doi:10.1073/pnas.0707244104

76

Alter G, Suscovich TJ, Teigen N, Meier A, Streeck H, Brander C, et al. Single-stranded RNA derived from HIV-1 serves as a potent activator of NK cells. J Immunol (2007) 178(12):7658–66.

77

Sedaghat AR, German J, Teslovich TM, Cofrancesco J Jr, Jie CC, Talbot CC Jr, et al. Chronic CD4+ T-cell activation and depletion in human immunodeficiency virus type 1 infection: type I interferon-mediated disruption of T-cell dynamics. J Virol (2008) 82(4):1870–83. doi:10.1128/JVI.02228-07

78

Hyrcza MD, Kovacs C, Loutfy M, Halpenny R, Heisler L, Yang S, et al. Distinct transcriptional profiles in ex vivo CD4 and CD8 T cells are established early in human immunodeficiency virus type 1 infection and are characterized by a chronic interferon response as well as extensive transcriptional changes in CD8+ T cells. J Virol (2007) 81(7):3477–86. doi:10.1128/JVI.01552-06

79

Cohen-Stuart JW, Hazenberg MD, Hamann D, Otto SA, Borleffs JC, Miedema F, et al. The dominant source of CD4+ and CD8+ T-cell activation in HIV infection is antigenic stimulation. J Acquir Immune Defic Syndr (2000) 25(3):203–11. doi:10.1097/00126334-200011010-00001

80

Bucy RP, Hockett RD, Derdeyn CA, Saag MS, Squires K, Sillers M, et al. Initial increase in blood CD4+ lymphocytes after HIV antiretroviral therapy reflects redistribution from lymphoid tissues. J Clin Invest (1999) 103(10):1391–8. doi:10.1172/JCI5863

81

Wolf K, Tsakiris DA, Weber R, Erb P, Battegay M. Antiretroviral therapy reduces markers of endothelial and coagulation activation in patients infected with human immunodeficiency virus type 1. J Infect Dis (2002) 185(4):456–62. doi:10.1086/338572

82

Theofilopoulos AN, Baccala R, Beutler B, Kono DH. Type I interferons (alpha/beta) in immunity and autoimmunity. Annu Rev Immunol (2005) 23:307–36. doi:10.1146/annurev.immunol.23.021704.115843

83

Jacquelin B, Mayau V, Targat B, Liovat AS, Kunkel D, Petitjean G, et al. Nonpathogenic SIV infection of African green monkeys induces a strong but rapidly controlled type I IFN response. J Clin Invest (2009) 119(12):3544–55. doi:10.1172/JCI40093

84

Ito T, Kanzler H, Duramad O, Cao W, Liu YJ. Specialization, kinetics, and repertoire of type 1 interferon responses by human plasmacytoid predendritic cells. Blood (2006) 107(6):2423–31. doi:10.1182/blood-2005-07-2709

85

Bjorck P. Dendritic cells exposed to herpes simplex virus in vivo do not produce IFN-alpha after rechallenge with virus in vitro and exhibit decreased T cell alloreactivity. J Immunol (2004) 172(9):5396–404.

86

O’Brien M, Manches O, Sabado RL, Baranda SJ, Wang Y, Marie I, et al. Spatio-temporal trafficking of HIV in human plasmacytoid dendritic cells defines a persistently IFN-alpha-producing and partially matured phenotype. J Clin Invest (2011) 121(3):1088–101. doi:10.1172/JCI44960

87

Rey-Cuille MA, Berthier JL, Bomsel-Demontoy MC, Chaduc Y, Montagnier L, Hovanessian AG, et al. Simian immunodeficiency virus replicates to high levels in sooty mangabeys without inducing disease. J Virol (1998) 72(5):3872–86.

88

Hartung S, Boller K, Cichutek K, Norley SG, Kurth R. Quantitation of a lentivirus in its natural host: simian immunodeficiency virus in African green monkeys. J Virol (1992) 66(4):2143–9.

89

Goldstein S, Ourmanov I, Brown CR, Beer BE, Elkins WR, Plishka R, et al. Wide range of viral load in healthy African green monkeys naturally infected with simian immunodeficiency virus. J Virol (2000) 74(24):11744–53. doi:10.1128/JVI.74.24.11744-11753.2000

90

Diop OM, Gueye A, Dias-Tavares M, Kornfeld C, Faye A, Ave P, et al. High levels of viral replication during primary simian immunodeficiency virus SIVagm infection are rapidly and strongly controlled in African green monkeys. J Virol (2000) 74(16):7538–47. doi:10.1128/JVI.74.16.7538-7547.2000

91

Broussard SR, Staprans SI, White R, Whitehead EM, Feinberg MB, Allan JS. Simian immunodeficiency virus replicates to high levels in naturally infected African green monkeys without inducing immunologic or neurologic disease. J Virol (2001) 75(5):2262–75. doi:10.1128/JVI.75.5.2262-2275.2001

92

Dunham R, Pagliardini P, Gordon S, Sumpter B, Engram J, Moanna A, et al. The AIDS resistance of naturally SIV-infected sooty mangabeys is independent of cellular immunity to the virus. Blood (2006) 108(1):209–17. doi:10.1182/blood-2005-12-4897

93

Li B, Stefano-Cole K, Kuhrt DM, Gordon SN, Else JG, Mulenga J, et al. Nonpathogenic simian immunodeficiency virus infection of sooty mangabeys is not associated with high levels of autologous neutralizing antibodies. J Virol (2010) 84(12):6248–53. doi:10.1128/JVI.00295-10

94

Sodora DL, Allan JS, Apetrei C, Brenchley JM, Douek DC, Else JG, et al. Toward an AIDS vaccine: lessons from natural simian immunodeficiency virus infections of African nonhuman primate hosts. Nat Med (2009) 15(8):861–5. doi:10.1038/nm.2013

95

Pandrea I, Gaufin T, Brenchley JM, Gautam R, Monjure C, Gautam A, et al. Cutting edge: experimentally induced immune activation in natural hosts of simian immunodeficiency virus induces significant increases in viral replication and CD4+ T cell depletion. J Immunol (2008) 181(10):6687–91.

96

Canary LA, Vinton CL, Morcock DR, Pierce JB, Estes JD, Brenchley JM, et al. Rate of AIDS progression is associated with gastrointestinal dysfunction in simian immunodeficiency virus-infected pigtail macaques. J Immunol (2013) 190(6):2959–65. doi:10.4049/jimmunol.1202319

97

Milush JM, Reeves JD, Gordon SN, Zhou D, Muthukumar A, Kosub DA, et al. Virally induced CD4+ T cell depletion is not sufficient to induce AIDS in a natural host. J Immunol (2007) 179(5):3047–56.

98

Barry AP, Silvestri G, Safrit JT, Sumpter B, Kozyr N, McClure HM, et al. Depletion of CD8+ cells in sooty mangabey monkeys naturally infected with simian immunodeficiency virus reveals limited role for immune control of virus replication in a natural host species. J Immunol (2007) 178(12):8002–12.

99

Gordon SN, Dunham RM, Engram JC, Estes J, Wang Z, Klatt NR, et al. Short-lived infected cells support virus replication in sooty mangabeys naturally infected with simian immunodeficiency virus: implications for AIDS pathogenesis. J Virol (2008) 82(7):3725–35. doi:10.1128/JVI.02408-07

100

Kornfeld C, Ploquin MJ, Pandrea I, Faye A, Onanga R, Apetrei C, et al. Antiinflammatory profiles during primary SIV infection in African green monkeys are associated with protection against AIDS. J Clin Invest (2005) 115(4):1082–91. doi:10.1172/JCI200523006

101

Silvestri G, Feinberg MB. Turnover of lymphocytes and conceptual paradigms in HIV infection. J Clin Invest (2003) 112(6):821–4. doi:10.1172/JCI200319799

102

Mandl JN, Barry AP, Vanderford TH, Kozyr N, Chavan R, Klucking S, et al. Divergent TLR7 and TLR9 signaling and type I interferon production distinguish pathogenic and nonpathogenic AIDS virus infections. Nat Med (2008) 14(10):1077–87. doi:10.1038/nm.1871

103

Mandl JN, Akondy R, Lawson B, Kozyr N, Staprans SI, Ahmed R, et al. Distinctive TLR7 signaling, type I IFN production, and attenuated innate and adaptive immune responses to yellow fever virus in a primate reservoir host. J Immunol (2011) 186(11):6406–16. doi:10.4049/jimmunol.1001191

104

Bosinger SE, Li Q, Gordon SN, Klatt NR, Duan L, Xu L, et al. Global genomic analysis reveals rapid control of a robust innate response in SIV-infected sooty mangabeys. J Clin Invest (2009) 119(12):3556–72. doi:10.1172/JCI40115

105

Harris LD, Tabb B, Sodora DL, Paiardini M, Klatt NR, Douek DC, et al. Down-regulation of robust acute type I IFN responses distinguishes non-pathogenic SIV infection of natural hosts from pathogenic SIV infection of rhesus macaques. J Virol (2010) 84(15):7886–91. doi:10.1128/JVI.02612-09

106

Lederer S, Favre D, Walters KA, Proll S, Kanwar B, Kasakow Z, et al. Transcriptional profiling in pathogenic and non-pathogenic SIV infections reveals significant distinctions in kinetics and tissue compartmentalization. PLoS Pathog (2009) 5(2):e1000296. doi:10.1371/journal.ppat.1000296

107

Estes JD, Gordon SN, Zeng M, Chahroudi AM, Dunham RM, Staprans SI, et al. Early resolution of acute immune activation and induction of PD-1 in SIV-infected sooty mangabeys distinguishes nonpathogenic from pathogenic infection in rhesus macaques. J Immunol (2008) 180(10):6798–807.

108

Meythaler M, Martinot A, Wang Z, Pryputniewicz S, Kasheta M, Ling B, et al. Differential CD4+ T-lymphocyte apoptosis and bystander T-cell activation in rhesus macaques and sooty mangabeys during acute simian immunodeficiency virus infection. J Virol (2009) 83(2):572–83. doi:10.1128/JVI.01715-08

109

Manches O, Bhardwaj N. Resolution of immune activation defines nonpathogenic SIV infection. J Clin Invest (2009) 119(12):3512–5. doi:10.1172/JCI41509

110

Brenchley JM, Silvestri G, Douek DC. Nonprogressive and progressive primate immunodeficiency lentivirus infections. Immunity (2010) 32(6):737–42. doi:10.1016/j.immuni.2010.06.004

111

Estes JD, Harris LD, Klatt NR, Tabb B, Pittaluga S, Paiardini M, et al. Damaged intestinal epithelial integrity linked to microbial translocation in pathogenic simian immunodeficiency virus infections. PLoS Pathog (2010) 6(8):e1001052. doi:10.1371/journal.ppat.1001052

112

Paiardini M, Cervasi B, Reyes-Aviles E, Micci L, Ortiz AM, Chahroudi A, et al. Low levels of SIV infection in sooty mangabey central memory CD4(+) T-cells are associated with limited CCR5 expression. Nat Med (2011) 17(7):830–6. doi:10.1038/nm.2395

113

Beaumier CM, Harris LD, Goldstein S, Klatt NR, Whitted S, McGinty J, et al. CD4 down-regulation by memory CD4+ T cells in vivo renders African green monkeys resistant to progressive SIVagm infection. Nat Med (2009) 15(8):879–85. doi:10.1038/nm.1970

114

Ganesan A, Chattopadhyay PK, Brodie TM, Qin J, Gu W, Mascola JR, et al. Immunologic and virologic events in early HIV infection predict subsequent rate of progression. J Infect Dis (2010) 201(2):272–84. doi:10.1086/649430

115

Schneider DS, Ayres JS. Two ways to survive infection: what resistance and tolerance can teach us about treating infectious diseases. Nat Rev Immunol (2008) 8(11):889–95. doi:10.1038/nri2432

116

Medzhitov R. Damage control in host-pathogen interactions. Proc Natl Acad Sci U S A (2009) 106(37):15525–6. doi:10.1073/pnas.0908451106

117

Choudhary SK, Vrisekoop N, Jansen CA, Otto SA, Schuitemaker H, Miedema F, et al. Low immune activation despite high levels of pathogenic human immunodeficiency virus type 1 results in long-term asymptomatic disease. J Virol (2007) 81(16):8838–42. doi:10.1128/JVI.02663-06

118

Rotger M, Dalmau J, Rauch A, McLaren P, Bosinger SE, Martinez R, et al. Comparative transcriptomics of extreme phenotypes of human HIV-1 infection and SIV infection in sooty mangabey and rhesus macaque. J Clin Invest (2011) 121(6):2391–400. doi:10.1172/JCI45235

119

Hardy GA, Sieg S, Rodriguez B, Anthony D, Asaad R, Jiang W, et al. Interferon-alpha is the primary plasma type-I IFN in HIV-1 infection and correlates with immune activation and disease markers. PLoS One (2013) 8(2):e56527. doi:10.1371/journal.pone.0056527

120

Meier A, Chang J, Chan ES, Pollard RB, Sidhu HK, Kulkarni S, et al. Sex differences in the toll-like receptor-mediated response of plasmacytoid dendritic cells to HIV-1. Nat Med (2009) 15(8):955–9. doi:10.1038/nm.2004

121

Chang JJ, Woods M, Lindsay RJ, Doyle EH, Griesbeck M, Chan ES, et al. Higher expression of several interferon-stimulated genes in HIV-1-infected females after adjusting for the level of viral replication. J Infect Dis (2013) 208(5):830–8. doi:10.1093/infdis/jit262

122

Sterling TR, Vlahov D, Astemborski J, Hoover DR, Margolick JB, Quinn TC. Initial plasma HIV-1 RNA levels and progression to AIDS in women and men. N Engl J Med (2001) 344(10):720–5. doi:10.1056/NEJM200103083441003

123

Chang J, Lindsay RJ, Kulkarni S, Lifson JD, Carrington M, Altfeld M. Polymorphisms in interferon regulatory factor 7 reduce interferon-alpha responses of plasmacytoid dendritic cells to HIV-1. AIDS (2011) 25(5):715–7. doi:10.1097/QAD.0b013e328343c186

124

Murray SM, Down CM, Boulware DR, Stauffer WM, Cavert WP, Schacker TW, et al. Reduction of immune activation with chloroquine therapy during chronic HIV infection. J Virol (2010) 84(22):12082–6. doi:10.1128/JVI.01466-10

125

Gonzalez-Gay MA, Gonzalez-Juanatey C, Vazquez-Rodriguez TR, Miranda-Filloy JA, Llorca J. Insulin resistance in rheumatoid arthritis: the impact of the anti-TNF-alpha therapy. Ann N Y Acad Sci (2010) 1193:153–9. doi:10.1111/j.1749-6632.2009.05287.x

126

Moreland LW, Curtis JR. Systemic nonarticular manifestations of rheumatoid arthritis: focus on inflammatory mechanisms. Semin Arthritis Rheum (2009) 39(2):132–43. doi:10.1016/j.semarthrit.2008.08.003

127

McKellar GE, McCarey DW, Sattar N, McInnes IB. Role for TNF in atherosclerosis? Lessons from autoimmune disease. Nat Rev Cardiol (2009) 6(6):410–7. doi:10.1038/nrcardio.2009.57

128

Roederer M, Gregson Dubs J, Anderson MT, Raju PA, Herzenberg LA, Herzenberg L. CD8 naive T cell counts decrease progressively in HIV-infected adults. J Clin Invest (1995) 95:2061–6. doi:10.1172/JCI117892

129

Picker LJ, Hagen SI, Lum R, Reed-Inderbitzin EF, Daly LM, Sylwester AW, et al. Insufficient production and tissue delivery of CD4+ memory T cells in rapidly progressive simian immunodeficiency virus infection. J Exp Med (2004) 200(10):1299–314. doi:10.1084/jem.20041049

130

Ribeiro RM, Mohri H, Ho DD, Perelson AS. In vivo dynamics of T cell activation, proliferation, and death in HIV-1 infection: why are CD4+ but not CD8+ T cells depleted? Proc Natl Acad Sci U S A (2002) 99(24):15572–7. doi:10.1073/pnas.242358099

131

Clark DR, Repping S, Pakker NG, Prins JM, Notermans DW, Wit FW, et al. T-cell progenitor function during progressive human immunodeficiency virus-1 infection and after antiretroviral therapy. Blood (2000) 96(1):242–9.

132

Schacker TW, Nguyen PL, Beilman GJ, Wolinsky S, Larson M, Reilly C, et al. Collagen deposition in HIV-1 infected lymphatic tissues and T cell homeostasis. J Clin Invest (2002) 110(8):1133–9. doi:10.1172/JCI200216413

133

Estes JD, Wietgrefe S, Schacker T, Southern P, Beilman G, Reilly C, et al. Simian immunodeficiency virus-induced lymphatic tissue fibrosis is mediated by transforming growth factor beta 1-positive regulatory T cells and begins in early infection. J Infect Dis (2007) 195(4):551–61. doi:10.1086/510852

134

Zeng M, Smith AJ, Wietgrefe SW, Southern PJ, Schacker TW, Reilly CS, et al. Cumulative mechanisms of lymphoid tissue fibrosis and T cell depletion in HIV-1 and SIV infections. J Clin Invest (2011) 121(3):998–1008. doi:10.1172/JCI45157

135

Day CL, Kaufmann DE, Kiepiela P, Brown JA, Moodley ES, Reddy S, et al. PD-1 expression on HIV-specific T cells is associated with T-cell exhaustion and disease progression. Nature (2006) 443(7109):350–4. doi:10.1038/nature05115

136

Trautmann L, Janbazian L, Chomont N, Said EA, Gimmig S, Bessette B, et al. Upregulation of PD-1 expression on HIV-specific CD8+ T cells leads to reversible immune dysfunction. Nat Med (2006) 12(10):1198–202. doi:10.1038/nm1106-1329b

137

Kaufmann DE, Kavanagh DG, Pereyra F, Zaunders JJ, Mackey EW, Miura T, et al. Upregulation of CTLA-4 by HIV-specific CD4+ T cells correlates with disease progression and defines a reversible immune dysfunction. Nat Immunol (2007) 8(11):1246–54. doi:10.1038/ni1515

138

Terpstra FG, Al BJ, Roos MT, De Wolf F, Goudsmit J, Schellekens PT, et al. Longitudinal study of leukocyte functions in homosexual men seroconverted for HIV-1: rapid and persistent loss of B-cell function after HIV-1 infection. Eur J Immunol (1989) 19:667–73. doi:10.1002/eji.1830190415

139

Lane HC, Depper JL, Greene WC, Whalen G, Waldmann TA, Fauci AS. Qualitative analysis of immune function in patients with the acquired immunodeficiency syndrome. N Engl J Med (1985) 313:79–84. doi:10.1056/NEJM198507113130204

140

Martinez-Maza O, Crabb E, Mitsuyasu RT, Fahey JL, Giorgi JV. Infection with the human immunodeficiency virus (HIV) is associated with in vivo increase in B lymphocyte activation and immaturity. J Immunol (1987) 138:3720–4.

141

Cagigi A, Nilsson A, De Milito A, Chiodi F. B cell immunopathology during HIV-1 infection: lessons to learn for HIV-1 vaccine design. Vaccine (2008) 26(24):3016–25. doi:10.1016/j.vaccine.2007.11.063

142

Moir S, Fauci AS. B cells in HIV infection and disease. Nat Rev Immunol (2009) 9(4):235–45. doi:10.1038/nri2524

143

Boliar S, Murphy MK, Tran TC, Carnathan DG, Armstrong WS, Silvestri G, et al. B-lymphocyte dysfunction in chronic HIV-1 infection does not prevent cross-clade neutralization breadth. J Virol (2012) 86(15):8031–40. doi:10.1128/JVI.00771-12

144

Zhang Z, Xu X, Lu J, Zhang S, Gu L, Fu J, et al. B and T lymphocyte attenuator down-regulation by HIV-1 depends on type I interferon and contributes to T-cell hyperactivation. J Infect Dis (2011) 203(11):1668–78. doi:10.1093/infdis/jir165

145

Cubas RA, Mudd JC, Savoye AL, Perreau M, van Grevenynghe J, Metcalf T, et al. Inadequate T follicular cell help impairs B cell immunity during HIV infection. Nat Med (2013) 19(4):494–9. doi:10.1038/nm.3109

146

Younes SA, Yassine-Diab B, Dumont AR, Boulassel MR, Grossman Z, Routy JP, et al. HIV-1 viremia prevents the establishment of interleukin 2-producing HIV-specific memory CD4+ T cells endowed with proliferative capacity. J Exp Med (2003) 198(12):1909–22. doi:10.1084/jem.20031598

147

McNeil AC, Shupert WL, Iyasere CA, Hallahan CW, Mican JA, Davey RT Jr, et al. High-level HIV-1 viremia suppresses viral antigen-specific CD4(+) T cell proliferation. Proc Natl Acad Sci U S A (2001) 98(24):13878–83. doi:10.1073/pnas.251539598

148

Kaufman D, Lichterfeld M, Altfeld M, Allen TM, Johnston M, Lee P, et al. Limited durability of immune control following treated acute HIV infection. PLoS Med (2004) 1(2):e36. doi:10.1371/journal.pmed.0010036

149

Jansen CA, De Cuyper IM, Steingrover R, Jurriaans S, Sankatsing SUC, Prins JM, et al. Analysis of the effect of highly active antiretroviral therapy during acute HIV-1 infection on HIV-specific CD4+ T-cell functions. AIDS (2005) 19:1145–54. doi:10.1097/01.aids.0000176214.17990.94

150

Jansen CA, Piriou E, De Cuyper IM, van Dort K, Lange JM, Miedema F, et al. Long-term highly active antiretroviral therapy in chronic HIV-1 infection: evidence for reconstitution of antiviral immunity. Antivir Ther (2006) 11(1):105–16.

151

Harari A, Petitpierre S, Vallelian F, Pantaleo G. Skewed representation of functionally distinct populations of virus-specific CD4 T cells in HIV-1-infected subjects with progressive disease: changes after antiretroviral therapy. Blood (2004) 103(3):966–72. doi:10.1182/blood-2003-04-1203

152

Harari A, Vallelian F, Pantaleo G. Phenotypic heterogeneity of antigen-specific CD4 T cells under different conditions of antigen persistence and antigen load. Eur J Immunol (2004) 34(12):3525–33. doi:10.1002/eji.200425324

153

Harari A, Vallelian F, Meylan PR, Pantaleo G. Functional heterogeneity of memory CD4 T cell responses in different conditions of antigen exposure and persistence. J Immunol (2005) 174(2):1037–45.

154

Migueles SA, Laborico AC, Shupert WL, Sabbaghian MS, Rabin R, Hallahan CW, et al. HIV-specific CD8+ T cell proliferation is coupled to perforin expression and is maintained in non-progressors. Nat Immunol (2002) 3(11):1061–8. doi:10.1038/ni845

155

Champagne P, Ogg GS, King AS, Knabenhans C, Ellefsen K, Nobile M, et al. Skewed maturation of memory HIV-specific CD8 T lymphocytes. Nature (2001) 410:106–11. doi:10.1038/35065118

156

Betts MR, Krowka JF, Kepler TB, Davidian M, Christopherson C, Kwok S, et al. Human immunodeficiency virus type 1-specific cytotoxic T lymphocyte activity is inversely correlated with HIV type 1 viral load in HIV type 1-infected long-term survivors. AIDS Res Hum Retroviruses (1999) 15(13):1219–28. doi:10.1089/088922299310313

157

Betts MR, Ambrozak DR, Douek DC, Bonhoeffer S, Brenchley JM, Casazza JP, et al. Analysis of total human immunodeficiency virus (HIV)-specific CD4(+) and CD8(+) T-cell responses: relationship to viral load in untreated HIV infection. J Virol (2001) 75(24):11983–91. doi:10.1128/JVI.75.24.11983-11991.2001

158

Appay V, Dunbar PR, Callan M, Klenerman P, Gillespie GM, Papagno L, et al. Memory CD8+ T cells vary in differentiation phenotype in different persistent virus infections. Nat Med (2002) 8(4):379–85. doi:10.1038/nm0402-379

159

van Baarle D, Tsegaye A, Miedema F, Akbar A. Significance of senescence for virus-specific memory T cell responses: rapid ageing during chronic stimulation of the immune system. Immunol Lett (2005) 97(1):19–29. doi:10.1016/j.imlet.2004.10.003

160

Klein MR, van Baalen CA, Holwerda AM, Kerkhof-Garde SR, Bende RJ, Keet IPM, et al. Kinetics of Gag-specific CTL responses during the clinical course of HIV-1 infection: a longitudinal analysis of rapid progressors and long-term asymptomatics. J Exp Med (1995) 181:1365–72. doi:10.1084/jem.181.4.1365

161

Pereyra F, Jia X, McLaren PJ, Telenti A, de Bakker PI, Walker BD, et al. The major genetic determinants of HIV-1 control affect HLA class I peptide presentation. Science (2010) 330(6010):1551–7. doi:10.1126/science.1195271

162

Borghans JA, Molgaard A, de Boer RJ, Kesmir C. HLA alleles associated with slow progression to AIDS truly prefer to present HIV-1 p24. PLoS One (2007) 2(9):e920. doi:10.1371/journal.pone.0000920

163

Dahirel V, Shekhar K, Pereyra F, Miura T, Artyomov M, Talsania S, et al. From the cover: coordinate linkage of HIV evolution reveals regions of immunological vulnerability. Proc Natl Acad Sci U S A (2011) 108(28):11530–5. doi:10.1073/pnas.1105315108

164

Buseyne F, Le CJ, Corre B, Porrot F, Burgard M, Rouzioux C, et al. Inverse correlation between memory Gag-specific cytotoxic T lymphocytes and viral replication in human immunodeficiency virus-infected children. J Infect Dis (2002) 186(11):1589–96. doi:10.1086/345482

165

Edwards BH, Bansal A, Sabbaj S, Bakari J, Mulligan MJ, Goepfert PA. Magnitude of functional CD8+ T-cell responses to the gag protein of human immunodeficiency virus type 1 correlates inversely with viral load in plasma. J Virol (2002) 76(5):2298–305. doi:10.1128/jvi.76.5.2298-2305.2002

166

Kiepiela P, Ngumbela K, Thobakgale C, Ramduth D, Honeyborne I, Moodley E, et al. CD8+ T-cell responses to different HIV proteins have discordant associations with viral load. Nat Med (2007) 13(1):46–53. doi:10.1038/nm1520

167

Masemola A, Mashishi T, Khoury G, Mohube P, Mokgotho P, Vardas E, et al. Hierarchical targeting of subtype C human immunodeficiency virus type 1 proteins by CD8+ T cells: correlation with viral load. J Virol (2004) 78(7):3233–43. doi:10.1128/JVI.78.7.3233-3243.2004

168

Novitsky V, Gilbert P, Peter T, McLane MF, Gaolekwe S, Rybak N, et al. Association between virus-specific T-cell responses and plasma viral load in human immunodeficiency virus type 1 subtype C infection. J Virol (2003) 77(2):882–90. doi:10.1128/JVI.77.2.882-890.2003

169

Zuniga R, Lucchetti A, Galvan P, Sanchez S, Sanchez C, Hernandez A, et al. Relative dominance of Gag p24-specific cytotoxic T lymphocytes is associated with human immunodeficiency virus control. J Virol (2006) 80(6):3122–5. doi:10.1128/JVI.80.6.3122-3125.2006

170

Prince JL, Claiborne DT, Carlson JM, Schaefer M, Yu T, Lahki S, et al. Role of transmitted Gag CTL polymorphisms in defining replicative capacity and early HIV-1 pathogenesis. PLoS Pathog (2012) 8(11):e1003041. doi:10.1371/journal.ppat.1003041

171

Goepfert PA, Lumm W, Farmer P, Matthews P, Prendergast A, Carlson JM, et al. Transmission of HIV-1 Gag immune escape mutations is associated with reduced viral load in linked recipients. J Exp Med (2008) 205(5):1009–17. doi:10.1084/jem.20072457

172

Schellens IM, Borghans JA, Jansen CA, De Cuyper IM, Geskus RB, van Baarle D, et al. Abundance of early functional HIV-specific CD8+ T cells does not predict AIDS-free survival time. PLoS One (2008) 3(7):e2745. doi:10.1371/journal.pone.0002745

173

Jansen CA, De Cuyper IM, Hooibrink B, van der Bij AK, van Baarle D, Miedema F. Prognostic value of HIV-1 Gag-specific CD4+ T-cell responses for progression to AIDS analysed in a prospective cohort study. Blood (2005) 107(4):1427–33. doi:10.1182/blood-2005-07-2907

174

Brumme Z, Wang B, Nair K, Brumme C, de Pierres C, Reddy S, et al. Impact of select immunologic and virologic biomarkers on CD4 cell count decrease in patients with chronic HIV-1 subtype C infection: results from Sinikithemba Cohort, Durban, South Africa. Clin Infect Dis (2009) 49(6):956–64. doi:10.1086/605503

175

Jansen CA, van Baarle D, Miedema F. HIV-specific CD4+ T cells and viremia: who’s in control? Trends Immunol (2006) 27(3):119–24. doi:10.1016/j.it.2006.01.004

176

Henry K, Melroe H, Huebsch J, Hermundson J, Levine C, Swensen L, et al. Severe premature coronary artery disease with protease inhibitors. Lancet (1998) 351(9112):1328. doi:10.1016/S0140-6736(05)79053-X

177

Holmberg SD, Moorman AC, Williamson JM, Tong TC, Ward DJ, Wood KC, et al. Protease inhibitors and cardiovascular outcomes in patients with HIV-1. Lancet (2002) 360(9347):1747–8. doi:10.1016/S0140-6736(02)11672-2

178

Mary-Krause M, Cotte L, Simon A, Partisani M, Costagliola D. Increased risk of myocardial infarction with duration of protease inhibitor therapy in HIV-infected men. AIDS (2003) 17(17):2479–86. doi:10.1097/00002030-200311210-00010

179

Friis-Moller N, Sabin CA, Weber R, d’Arminio MA, El-Sadr WM, Reiss P, et al. Combination antiretroviral therapy and the risk of myocardial infarction. N Engl J Med (2003) 349(21):1993–2003. doi:10.1056/NEJMoa030218

180

Friis-Moller N, Reiss P, Sabin CA, Weber R, Monforte A, El-Sadr W, et al. Class of antiretroviral drugs and the risk of myocardial infarction. N Engl J Med (2007) 356(17):1723–35. doi:10.1056/NEJMoa062744

181

Bozzette SA, Ake CF, Tam HK, Chang SW, Louis TA. Cardiovascular and cerebrovascular events in patients treated for human immunodeficiency virus infection. N Engl J Med (2003) 348(8):702–10. doi:10.1056/NEJMoa022048

182

El-Sadr WM, Lundgren JD, Neaton JD, Gordin F, Abrams D, Arduino RC, et al. CD4 count-guided interruption of antiretroviral treatment. N Engl J Med (2006) 355(22):2283–96. doi:10.1056/NEJMoa062360

183

Medzhitov R. Inflammation 2010: new adventures of an old flame. Cell (2010) 140(6):771–6. doi:10.1016/j.cell.2010.03.006

184

Hsue PY, Hunt PW, Sinclair E, Bredt B, Franklin A, Killian M, et al. Increased carotid intima-media thickness in HIV patients is associated with increased cytomegalovirus-specific T-cell responses. AIDS (2006) 20(18):2275–83. doi:10.1097/QAD.0b013e3280108704

185

Hsue PY, Hunt PW, Schnell A, Kalapus SC, Hoh R, Ganz P, et al. Role of viral replication, antiretroviral therapy, and immunodeficiency in HIV-associated atherosclerosis. AIDS (2009) 23(9):1059–67. doi:10.1097/QAD.0b013e32832b514b

186

Tebas P, Henry WK, Matining R, Weng-Cherng D, Schmitz J, Valdez H, et al. Metabolic and immune activation effects of treatment interruption in chronic HIV-1 infection: implications for cardiovascular risk. PLoS One (2008) 3(4):e2021. doi:10.1371/journal.pone.0002021

187

Phillips AN, Neaton J, Lundgren JD. The role of HIV in serious diseases other than AIDS. AIDS (2008) 22(18):2409–18. doi:10.1097/QAD.0b013e3283174636

188

Wei X, Decker JM, Wang S, Hui H, Kappes JC, Wu X, et al. Antibody neutralization and escape by HIV-1. Nature (2003) 422(6929):307–12. doi:10.1038/nature01470

189

Targher G, Bertolini L, Padovani R, Rodella S, Arcaro G, Day C. Differences and similarities in early atherosclerosis between patients with non-alcoholic steatohepatitis and chronic hepatitis B and C. J Hepatol (2007) 46(6):1126–32. doi:10.1016/j.jhep.2007.01.021

190

Yonkers NL, Sieg S, Rodriguez B, Anthony DD. Reduced naive CD4 T cell numbers and impaired induction of CD27 in response to T cell receptor stimulation reflect a state of immune activation in chronic hepatitis C virus infection. J Infect Dis (2011) 203(5):635–45. doi:10.1093/infdis/jiq101

191

Levy Y, Gahery-Segard H, Durier C, Lascaux AS, Goujard C, Meiffredy V, et al. Immunological and virological efficacy of a therapeutic immunization combined with interleukin-2 in chronically HIV-1 infected patients. AIDS (2005) 19(3):279–86.

192

Levy Y, Durier C, Lascaux AS, Meiffredy V, Gahery-Segard H, Goujard C, et al. Sustained control of viremia following therapeutic immunization in chronically HIV-1-infected individuals. AIDS (2006) 20(3):405–13. doi:10.1097/01.aids.0000206504.09159.d3

193

Autran B, Murphy RL, Costagliola D, Tubiana R, Clotet B, Gatell J, et al. Greater viral rebound and reduced time to resume antiretroviral therapy after therapeutic immunization with the ALVAC-HIV vaccine (vCP1452). AIDS (2008) 22(11):1313–22. doi:10.1097/QAD.0b013e3282fdce94

194

Abrams D, Levy Y, Losso MH, Babiker A, Collins G, Cooper DA, et al. Interleukin-2 therapy in patients with HIV infection. N Engl J Med (2009) 361(16):1548–59. doi:10.1056/NEJMoa0903175

195

Levy Y, Lacabaratz C, Weiss L, Viard JP, Goujard C, Lelievre JD, et al. Enhanced T cell recovery in HIV-1-infected adults through IL-7 treatment. J Clin Invest (2009) 119(4):997–1007. doi:10.1172/JCI38052

196

Sportes C, Hakim FT, Memon SA, Zhang H, Chua KS, Brown MR, et al. Administration of rhIL-7 in humans increases in vivo TCR repertoire diversity by preferential expansion of naive T cell subsets. J Exp Med (2008) 205(7):1701–14. doi:10.1084/jem.20071681

197

Napolitano LA, Schmidt D, Gotway MB, Ameli N, Filbert EL, Ng MM, et al. Growth hormone enhances thymic function in HIV-1-infected adults. J Clin Invest (2008) 118(3):1085–98.

198

Sereti I, Dunham RM, Spritzler J, Aga E, Proschan MA, Medvik K, et al. IL-7 administration drives T cell-cycle entry and expansion in HIV-1 infection. Blood (2009) 113(25):6304–14. doi:10.1182/blood-2008-10-186601

199

Deeks SG, Barbour JD, Grant RM, Martin JN. Duration and predictors of CD4 T-cell gains in patients who continue combination therapy despite detectable plasma viremia. AIDS (2002) 16(2):201–7. doi:10.1097/00002030-200201250-00009

200

Kaufmann DE, Walker BD. PD-1 and CTLA-4 inhibitory cosignaling pathways in HIV infection and the potential for therapeutic intervention. J Immunol (2009) 182(10):5891–7. doi:10.4049/jimmunol.0803771

201

Velu V, Titanji K, Zhu B, Husain S, Pladevega A, Lai L, et al. Enhancing SIV-specific immunity in vivo by PD-1 blockade. Nature (2009) 458(7235):206–10. doi:10.1038/nature07662

202

Dyavar SR, Velu V, Titanji K, Bosinger SE, Freeman GJ, Silvestri G, et al. PD-1 blockade during chronic SIV infection reduces hyperimmune activation and microbial translocation in rhesus macaques. J Clin Invest (2012) 122(5):1712–6. doi:10.1172/JCI60612

203

Cecchinato V, Tryniszewska E, Ma ZM, Vaccari M, Boasso A, Tsai WP, et al. Immune activation driven by CTLA-4 blockade augments viral replication at mucosal sites in simian immunodeficiency virus infection. J Immunol (2008) 180(8):5439–47.

204

Virgin HW, Walker BD. Immunology and the elusive AIDS vaccine. Nature (2010) 464(7286):224–31. doi:10.1038/nature08898

205

Burton DR, Weiss RA. AIDS/HIV. A boost for HIV vaccine design. Science (2010) 329(5993):770–3. doi:10.1126/science.1194693

206

Rizzardi GP, Harari A, Capiluppi B, Tambussi G, Ellefsen K, Ciuffreda D, et al. Treatment of primary HIV-1 infection with cyclosporin A coupled with highly active antiretroviral therapy. J Clin Invest (2002) 109(5):681–8. doi:10.1172/JCI0214522

207

Markowitz M, Vaida F, Hare CB, Boden D, Mohri H, Hecht FM, et al. The virologic and immunologic effects of cyclosporine as an adjunct to antiretroviral therapy in patients treated during acute and early HIV-1 infection. J Infect Dis (2010) 201(9):1298–302. doi:10.1086/651664

208

Vrisekoop N, Sankatsing SU, Jansen CA, Roos MT, Otto SA, Schuitemaker H, et al. Short communication: no detrimental immunological effects of mycophenolate mofetil and HAART in treatment-naive acute and chronic HIV-1-infected patients. AIDS Res Hum Retroviruses (2005) 21(12):991–6. doi:10.1089/aid.2005.21.991

209

Hennessy EJ, Parker AE, O’Neill LA. Targeting toll-like receptors: emerging therapeutics? Nat Rev Drug Discov (2010) 9(4):293–307. doi:10.1038/nrd3203

210

Hedayat M, Netea MG, Rezaei N. Targeting of toll-like receptors: a decade of progress in combating infectious diseases. Lancet Infect Dis (2011) 11(9):702–12. doi:10.1016/S1473-3099(11)70099-8

211

Tabb B, Morcock DR, Trubey CM, Quinones OA, Hao XP, Smedley J, et al. Reduced inflammation and lymphoid tissue immunopathology in rhesus macaques receiving anti-tumor necrosis factor treatment during primary simian immunodeficiency virus infection. J Infect Dis (2013) 207(6):880–92. doi:10.1093/infdis/jis643

212

Bissonnette R, Papp K, Maari C, Yao Y, Robbie G, White WI, et al. A randomized, double-blind, placebo-controlled, phase I study of MEDI-545, an anti-interferon-alfa monoclonal antibody, in subjects with chronic psoriasis. J Am Acad Dermatol (2010) 62(3):427–36. doi:10.1016/j.jaad.2009.05.042

213

Merrill JT, Wallace DJ, Petri M, Kirou KA, Yao Y, White WI, et al. Safety profile and clinical activity of sifalimumab, a fully human anti-interferon alpha monoclonal antibody, in systemic lupus erythematosus: a phase I, multicentre, double-blind randomised study. Ann Rheum Dis (2011) 70(11):1905–13. doi:10.1136/ard.2010.144485

214

Gringeri A, Musicco M, Hermans P, Bentwich Z, Cusini M, Bergamasco A, et al. Active anti-interferon-alpha immunization: a European-Israeli, randomized, double-blind, placebo-controlled clinical trial in 242 HIV-1–infected patients (the EURIS study). J Acquir Immune Defic Syndr Hum Retrovirol (1999) 20(4):358–70. doi:10.1097/00042560-199904010-00006

215

Tavel JA, Huang CY, Shen J, Metcalf JA, Dewar R, Shah A, et al. Interferon-alpha produces significant decreases in HIV load. J Interferon Cytokine Res (2010) 30(7):461–4. doi:10.1089/jir.2009.0090

216

Kovacs JA, Deyton L, Davey R, Falloon J, Zunich K, Lee D, et al. Combined zidovudine and interferon-alpha therapy in patients with Kaposi sarcoma and the acquired immunodeficiency syndrome (AIDS). Ann Intern Med (1989) 111(4):280–7. doi:10.7326/0003-4819-111-4-280

217

Pesce A, Taillan B, Rosenthal E, Garnier G, Vinti H, Dujardin P, et al. Opportunistic infections and CD4 lymphocytopenia with interferon treatment in HIV-1 infected patients. Lancet (1993) 341(8860):1597. doi:10.1016/0140-6736(93)90736-Z

218

Landau A, Batisse D, Duong Van Huyen JP, Piketty C, Bloch F, Pialoux G, et al. Efficacy and safety of combination therapy with interferon-alpha 2b and ribavirin for chronic hepatitis C in HIV-infected patients. AIDS (2000) 14(7):839–44. doi:10.1097/00002030-200005050-00010

219

Arizcorreta A, Marquez M, Fernandez-Gutierrez C, Guzman EP, Brun F, Rodriguez-Iglesias M, et al. T cell receptor excision circles (TRECs), CD4+, CD8+, and their CD45RO+, and CD45RA+, subpopulations in hepatitis C virus (HCV)-HIV-co-infected patients during treatment with interferon alpha plus ribavirin: analysis in a population on effective antiretroviral therapy. Clin Exp Immunol (2006) 146(2):270–7.

220

Cepeda EJ, Williams FM, Ishimori ML, Weisman MH, Reveille JD. The use of anti-tumour necrosis factor therapy in HIV-positive individuals with rheumatic disease. Ann Rheum Dis (2008) 67(5):710–2. doi:10.1136/ard.2007.081513

221

Bisoendial RJ, Stroes ES, Kastelein JJ, Tak PP. Targeting cardiovascular risk in rheumatoid arthritis: a dual role for statins. Nat Rev Rheumatol (2010) 6(3):157–64. doi:10.1038/nrrheum.2009.277

Other

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
 
Received: 18 July 2013; accepted: 09 September 2013; published online: 26 September 2013.
 
Citation: Miedema F, Hazenberg MD, Tesselaar K, van Baarle D, de Boer RJ and Borghans JAM (2013) Immune activation and collateral damage in AIDS pathogenesis. Front. Immunol. 4:298. doi: 10.3389/fimmu.2013.00298. This article was submitted to HIV and AIDS, a section of the journal Frontiers in Immunology.
 
Copyright © 2013 Miedema, Hazenberg, Tesselaar, van Baarle, de Boer and Borghans. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
