MakiPan committed
Commit 006dceb
Parent: 9e5b250

Update app.py

Files changed (1)
  1. app.py +2 -36
app.py CHANGED
@@ -261,6 +261,7 @@ To preprocess the data there were three options we considered:
   </td>
   </tr></table>
   </center>
+
 - The last option was to use [MediaPipe Holistic](https://ai.googleblog.com/2020/12/mediapipe-holistic-simultaneous-face.html) to provide pose face and hand landmarks to the ControlNet. This method was promising in theory, however, the HaGRID dataset was not suitable for this method as the Holistic model performs poorly with partial body and obscurely cropped images.
 
 We anecdotally determined that when trained at lower steps the encoded hand model performed better than the standard MediaPipe model due to implied handedness. We theorize that with a larger dataset of more full-body hand and pose classifications, Holistic landmarks will provide the best images in the future however for the moment the hand encoded model performs best. """)
@@ -290,8 +291,7 @@ We anecdotally determined that when trained at lower steps the encoded hand mode
         with gr.Column():
             output_image = gr.Gallery(label='Output Image', show_label=False, elem_id="gallery").style(grid=2, height='auto')
 
-    if model_type=="Standard":
-        gr.Examples(
+    gr.Examples(
             examples=[
                 [
                     "a woman is making an ok sign in front of a painting",
@@ -324,40 +324,6 @@ We anecdotally determined that when trained at lower steps the encoded hand mode
             fn=infer,
             cache_examples=True,
         )
-    elif model_type=="Hand Encoding":
-        gr.Examples(
-            examples=[
-                [
-                    "a woman is making an ok sign in front of a painting",
-                    "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
-                    "example.png"
-                ],
-                [
-                    "a man with his hands up in the air making a rock sign",
-                    "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
-                    "example1.png"
-                ],
-                [
-                    "a man is making a thumbs up gesture",
-                    "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
-                    "example2.png"
-                ],
-                [
-                    "a woman is holding up her hand in front of a window",
-                    "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
-                    "example3.png"
-                ],
-                [
-                    "a man with his finger on his lips",
-                    "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
-                    "example4.png"
-                ],
-            ],
-            inputs=[prompt_input, negative_prompt, input_image, model_type],
-            outputs=[output_image],
-            fn=infer,
-            cache_examples=False, #cache_examples=True,
-        )
 
     inputs = [prompt_input, negative_prompt, input_image, model_type]
     submit_btn.click(fn=infer, inputs=inputs, outputs=[output_image])
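
Net effect of the commit: the two conditional Examples blocks (one per model) collapse into a single unconditional gr.Examples call, and the selected model simply travels through model_type as one more input to infer. Below is a minimal, self-contained sketch of that wiring, not the Space's actual code: the component names mirror app.py, but the placeholder infer(), the Radio choices, the fourth example value ("Standard"), and cache_examples=False are assumptions made for illustration, and example.png stands in for the sample images shipped with the Space.

# Sketch of the consolidated layout after this commit (illustrative only).
import gradio as gr

def infer(prompt, negative_prompt, image, model_type):
    # Placeholder for app.py's real infer(), which runs the ControlNet
    # pipeline selected by model_type ("Standard" or "Hand Encoding").
    return [image] if image is not None else []

with gr.Blocks() as demo:
    model_type = gr.Radio(["Standard", "Hand Encoding"], value="Standard", label="Model")
    prompt_input = gr.Textbox(label="Prompt")
    negative_prompt = gr.Textbox(label="Negative Prompt")
    input_image = gr.Image(label="Input Image", type="pil")
    submit_btn = gr.Button("Submit")
    output_image = gr.Gallery(label="Output Image", show_label=False, elem_id="gallery")

    # One Examples block shared by both models; model_type is just another example field.
    gr.Examples(
        examples=[
            [
                "a woman is making an ok sign in front of a painting",
                "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
                "example.png",  # sample image shipped with the Space
                "Standard",
            ],
        ],
        inputs=[prompt_input, negative_prompt, input_image, model_type],
        outputs=[output_image],
        fn=infer,
        cache_examples=False,  # app.py caches; disabled here so the sketch runs without prebuilt caches
    )

    inputs = [prompt_input, negative_prompt, input_image, model_type]
    submit_btn.click(fn=infer, inputs=inputs, outputs=[output_image])

if __name__ == "__main__":
    demo.launch()

One plausible reading of why the branches were dropped: at interface-build time model_type is a Gradio component object rather than a string, so if model_type=="Standard" could never be true and neither Examples block was ever registered; a single block whose example rows pass the selected value through to infer sidesteps that.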